\input acphdr % Section 4.5
\runninglefthead{ARITHMETIC} % chapter title
\titlepage\setcount00
\null
\vfill
\tenpoint
\ctrline{SECTION 4.5 of THE ART OF COMPUTER PROGRAMMING}
\ctrline{$\copyright$ 1978 Addison--Wesley Publishing Company, Inc.}
\vfill
\runningrighthead{RATIONAL ARITHMETIC}
\section{4.5}
\eject
\setcount0 310
%folio 412 galley 4b (C) Addison-Wesley 1978 *
\sectionbegin{4.5. RATIONAL ARITHMETIC}
I{\:cT IS OFTEN} important to know that the answer to some numerical problem
is exactly ${1\over 3}$,
not a floating-point number that gets printed as ``0.333333574''.
If arithmetic is done on fractions instead of on approximations
to fractions, many computations can be done entirely {\sl without
any accumulated rounding errors.} This results in a comfortable
feeling of security that is often lacking when floating-point
calculations have been made, and it means that the accuracy
of the calculation cannot be improved upon.
\runningrighthead{FRACTIONS}
\section{4.5.1}
\sectionskip
\sectionbegin{4.5.1. Fractions}
When fractional arithmetic is desired, the numbers can be represented
as pairs of integers, $(u/u↑\prime )$, where $u$ and $u↑\prime$
are relatively prime to each other and $u↑\prime > 0$. The number
zero is represented as $(0/1)$. In this form, $(u/u↑\prime ) =
(v/v↑\prime )$ if and only if $u = v$ and $u↑\prime =
v↑\prime $.
Multiplication of fractions is, of course, easy;
to form $(u/u↑\prime ) \times (v/v↑\prime ) = (w/w↑\prime
)$, we can simply compute $uv$ and $u↑\prime v↑\prime $. The
two products $uv$ and $u↑\prime v↑\prime$ might not be relatively
prime, but if $d =\gcd(uv, u↑\prime v↑\prime )$, the desired
answer is $w = uv/d$, $w↑\prime =u↑\prime v↑\prime /d$.\xskip (See exercise
2.)\xskip Efficient algorithms to compute the greatest common divisor
are discussed in Section 4.5.2.
Another way to perform the multiplication is to find $d↓1 =
\gcd(u, v↑\prime )$ and $d↓2 =\gcd(u↑\prime , v)$; then the
answer is $w = (u/d↓1)(v/d↓2)$, $w↑\prime = (u↑\prime /d↓2)(v↑\prime
/d↓1)$.\xskip (See exercise 3.)\xskip This method requires two gcd calculations,
but it is not really slower than the former method; the gcd
process involves a number of iterations that is essentially proportional
to the logarithm of its inputs, so the total number of iterations
needed to evaluate both $d↓1$ and $d↓2$ is essentially the same
as the number of iterations during the single calculation of
$d$. Furthermore, each iteration in the evaluation of $d↓1$ and
$d↓2$ is potentially faster, because comparatively small numbers
are being examined. If $u$, $u↑\prime$, $v$, and $v↑\prime$ are single-precision
quantities, this method has the advantage that no double-precision
numbers appear in the calculation unless it is impossible to
represent both of the answers $w$ and $w↑\prime$ in single-precision
form.
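A minimal C sketch of this second multiplication method appears below. The type name \.{frac}, the helper \.{gcd} (a straightforward rendering of Euclid's algorithm from Section 4.5.2), and the use of \.{long} for single-precision quantities are illustrative assumptions only; the overflow tests that a production routine would need are omitted.

    #include <stdlib.h>                       /* for labs */

    typedef struct { long num, den; } frac;   /* den > 0, gcd(num,den) = 1 */

    long gcd(long u, long v)
    {   /* Euclid's algorithm; works even if u or v is negative */
        while (v != 0) { long r = u % v; u = v; v = r; }
        return labs(u);
    }

    frac mult(frac x, frac y)                 /* (u/u') times (v/v') */
    {
        long d1 = gcd(x.num, y.den);          /* d1 = gcd(u, v')  */
        long d2 = gcd(x.den, y.num);          /* d2 = gcd(u', v)  */
        frac w;
        w.num = (x.num / d1) * (y.num / d2);  /* w  = (u/d1)(v/d2)   */
        w.den = (x.den / d2) * (y.den / d1);  /* w' = (u'/d2)(v'/d1) */
        return w;
    }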
Division may be done in a similar manner; see exercise 4.
Addition and subtraction are slightly more complicated. The obvious procedure is to
set $(u/u↑\prime ) \pm (v/v↑\prime ) = \biglp (uv↑\prime \pm
u↑\prime v)/u↑\prime v↑\prime \bigrp$ and then to reduce this fraction
to lowest terms by calculating $d =\gcd(uv↑\prime \pm u↑\prime
v, u↑\prime v↑\prime )$ as in the first multiplication method.
But again it is possible to avoid working with such large numbers,
if we start by calculating $d↓1 =\gcd(u↑\prime , v↑\prime
)$. If $d↓1 = 1$ then $w = uv↑\prime \pm u↑\prime v$ and $w↑\prime
= u↑\prime v↑\prime$ are the desired numerator and denominator.\xskip
(According to Theorem 4.5.2D\null, $d↓1$ will be 1 about 61 percent
of the time, if the denominators $u↑\prime$ and $v↑\prime$ are
randomly distributed, so it is wise to single out this case
separately.)\xskip If $d↓1 > 1$, then let $t = u(v↑\prime /d↓1)\pm
v(u↑\prime /d↓1)$ and calculate $d↓2 =\gcd(t, d↓1)$; finally
the answer is $w = t/d↓2$, $w↑\prime = (u↑\prime /d↓1)(v↑\prime
/d↓2)$.\xskip (Exercise 6 proves that these values of $w$ and $w↑\prime$
are relatively prime to each other.)\xskip If single-precision numbers
are being used, this method requires only single-precision operations,
except that $t$ may be a double-precision number or slightly
larger (see exercise 7); since $\gcd(t, d↓1) =\gcd(t\mod d↓1,
d↓1)$, the calculation of $d↓2$ does not require double precision.
For example, to compute $(7/66) + (17/12)$, we form $d↓1 =\gcd(66,
12) = 6$; then $t = 7 \cdot 2 + 17 \cdot 11 = 201$, and $d↓2
=\gcd(201, 6) = 3$, so the answer is
$${201\over 3}\left/\left({66\over 6}\,{12\over 3}\right)\right. = 67/44.$$
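The addition method just illustrated might be sketched in C as follows, reusing the hypothetical \.{frac} type and \.{gcd} routine from the multiplication sketch; subtraction is the same computation with \.{y.num} negated. The intermediate quantity $t$ is simply assumed here to fit in a \.{long}, although exercise 7 shows that it can exceed single precision.

    frac add(frac x, frac y)                  /* (u/u') + (v/v') */
    {
        long d1 = gcd(x.den, y.den);
        frac w;
        if (d1 == 1) {                        /* about 61 percent of the time */
            w.num = x.num * y.den + y.num * x.den;
            w.den = x.den * y.den;
        } else {
            long t  = x.num * (y.den / d1) + y.num * (x.den / d1);
            long d2 = gcd(t, d1);             /* = gcd(t mod d1, d1) */
            w.num = t / d2;
            w.den = (x.den / d1) * (y.den / d2);
        }
        return w;
    }

For the example in the text, \.{add} applied to $(7/66)$ and $(17/12)$ computes $d↓1=6$, $t=201$, $d↓2=3$, and returns $(67/44)$.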
Experience with fractional calculations
shows that in many cases the numbers grow to be quite large. So if
$u$ and $u↑\prime$ are intended to be single-precision numbers
for each fraction $(u/u↑\prime )$, it is important to include
tests for overflow in each of the addition, subtraction, multiplication,
and division subroutines. For numerical problems in which perfect
accuracy is important, a set of subroutines for fractional arithmetic
with {\sl arbitrary} precision allowed in numerator and denominator
is very useful.
The methods of this section extend also to other number fields
besides the rational numbers; for example, we could do arithmetic
on quantities of the form $(u + u↑\prime \sqrt{5}\,)/u↑{\prime\prime}
$, where $u$, $u↑\prime$, $u↑{\prime\prime}$ are integers, $\gcd(u, u↑\prime
, u↑{\prime\prime} ) = 1$, and $u↑{\prime\prime} > 0$; or on quantities of the
form $(u + u↑\prime \spose{\raise5pt\hbox{\hskip2.5pt$\scriptscriptstyle3$}}
\sqrt2+ u↑{\prime\prime}\spose{\raise5pt\hbox{\hskip2.5pt$\scriptscriptstyle3$}}
\sqrt4\,)/u↑{\prime\prime\prime}$, etc.
To help check out subroutines for rational arithmetic, inversion
of matrices with known inverses (e.g.,
Cauchy matrices, exercise 1.2.3--41) is suggested.
Exact representation of fractions within a computer was first
discussed in the literature by P. Henrici, {\sl JACM \bf 3}
(1956), 6--9.
\exbegin{EXERCISES}
\exno 1. [15] Suggest
a reasonable computational method for comparing two fractions,
to test whether or not $(u/u↑\prime ) < (v/v↑\prime )$.
\exno 2. [M15] Prove that if $d
=\gcd(u, v)$ then $u/d$ and $v/d$ are relatively prime.
\exno 3. [M20] Prove that if $u$ and $u↑\prime$ are relatively
prime, and if $v$ and $v↑\prime$ are relatively prime, then
$\gcd(uv, u↑\prime v↑\prime ) =\gcd(u, v↑\prime )\gcd(u↑\prime
, v)$.
\exno 4. [11] Design a division algorithm for fractions, analogous
to the second multiplication method of the text.\xskip (Note that
the sign of $v$ must be considered.)
\exno 5. [10] Compute $(17/120) + (-27/70)$ by the method recommended
in the text.
\trexno 6. [M23] Show that if $u, u↑\prime$ are relatively prime
and if $v, v↑\prime$ are relatively prime, then $\gcd(uv↑\prime
+ vu↑\prime , u↑\prime v↑\prime ) = d↓1d↓2$, where $d↓1 = \gcd(u↑\prime
, v↑\prime )$ and $d↓2 = \gcd(d↓1, u(v↑\prime /d↓1) + v(u↑\prime
/d↓1))$.\xskip (Hence if $d↓1 = 1$, then $uv↑\prime + vu↑\prime$ is relatively
prime to $u↑\prime v↑\prime$.)
\exno 7. [M22] How large can the
absolute value of the quantity $t$ become, in the addition-subtraction
method recommended in the text, if the input numerators and
denominators are less than $N$ in absolute value?
\trexno 8. [22] Discuss using $(1/0)$ and $(-1/0)$ as representations
for $∞$ and $-∞$, and/or as representations of overflow.
\exno 9. [M23] If $1 ≤ u↑\prime
$, $v↑\prime < 2↑n$, show that $\lfloor 2↑{2n}u/u↑\prime \rfloor
= \lfloor 2↑{2n}v/v↑\prime \rfloor$ implies $u/u↑\prime = v/v↑\prime
$.
\exno 10. [41] Extend the subroutines suggested in exercise
4.3.1--34 so that they deal with ``arbitrary'' rational numbers.
\exno 11. [M23] Consider fractions of the form $(u + u↑\prime
\sqrt5\,)/u↑{\prime\prime} $, where $u$, $u↑\prime$,
$u↑{\prime\prime}$ are integers, $\gcd(u, u↑\prime , u↑{\prime\prime} ) = 1$,
and $u↑{\prime\prime} > 0$. Explain how to divide two such fractions
and to obtain a quotient having the same form.
\trexno 12. [M40] (David W. Matula.)\xskip Consider ``fixed-slash'' and
``floating-slash'' numbers, which are analogous to floating-point numbers
but based on general fractions instead of on radix point positions.
In a fixed-slash scheme, the numerator and denominator of a representable
fraction each consist of at most $p$ bits, for some given $p$. In a floating-slash
scheme, the {\sl sum} of numerator bits plus denominator bits must be a total of
at most $q$, for some given $q$, and another field of the representation is used
to indicate how many of these $q$ bits belong to the numerator. To do
arithmetic on such numbers, we define $x\oplus y=\hbox{round}(x+y)$,
$x\ominus y=\hbox{round}(x-y)$, etc., where round$(x)$ is a representable
number closest to $x$.
For example, suppose we have a calculation that takes the form ${1\over3}=
{82\over173}-{73\over519}$, but we are using fixed-slash arithmetic with $p=5$
so that all numerators and denominators must be less than 32; the
intermediate quantities have to be rounded to the nearest representable numbers.
In this case $82\over173$ would round to $9\over19$ and $73\over519$ would
round to $1\over7$; then ${9\over19}-{1\over7}={44\over133}$ would round to
$1\over3$. Similarly, if we were using floating-slash arithmetic with
$q=13$, it turns out that $82\over173$ would round to $55\over116$, and $73\over519$
to $9\over64$; once again the difference ${55\over116}-{9\over64}={619\over1856}$
would round to $1\over3$. In both cases all the rounding errors cancel out,
indicating that if the true answer is a simple fraction we tend to get it
{\sl exactly} with ``slash arithmetic,'' in spite of the fact that intermediate
calculations are inaccurate.
Experiment with slash arithmetic in a variety of calculations. For example,
try to determine how many rational numbers $x$ and $y$ have the property that
$x-y={1\over3}$ but $\hbox{round}(x)\ominus\hbox{round}(y)≠{1\over3}$ in
fixed-slash or floating-slash arithmetic.\xskip[{\sl References:} D. W.
Matula, in {\sl Applications of Number Theory to Numerical Analysis},
S. K. Zaremba, ed.\ (New York: Academic Press, 1972), 486--489; D. W. Matula
and Peter Kornerup, {\sl Proc.\ IEEE Symp.\ Computer Arith.\ \bf4} (1978),
to appear.]
%folio 415 galley 1 (C) Addison-Wesley 1978 *
\runningrighthead{THE GREATEST COMMON DIVISOR}
\section{4.5.2}
\sectionskip
\sectionbegin{4.5.2. The Greatest Common Divisor}
If $u$ and $v$ are integers, not both zero, we
say that their {\sl greatest common divisor}, $\gcd(u, v)$, is
the largest integer that evenly divides both $u$ and $v$. This
definition makes sense, because if $u ≠ 0$ then no integer greater
than $|u|$ can evenly divide $u$, but the integer 1 does divide
both $u$ and $v$; hence there must be a largest integer that
divides them both. When $u$ and $v$ are both zero, every integer
evenly divides zero, so the above definition does not apply;
it is convenient to set
$$\baselineskip18pt\eqalignno{\gcd(0, 0) ⊗= 0.⊗(1)\cr
\noalign{\hbox{The definitions just given obviously imply that}}
\gcd(u, v) ⊗=\gcd(v, u),⊗(2)\cr
\gcd(u, v) ⊗=\gcd(-u, v),⊗(3)\cr
\gcd(u, 0) ⊗=|u|.⊗(4)\cr}$$
In the previous section, we reduced the problem
of expressing a rational number in ``lowest terms'' to the problem
of finding the greatest common divisor of its numerator and
denominator. Other applications of the greatest common divisor
have been mentioned for example in Sections 3.2.1.2, 3.3.3,
4.3.2, 4.3.3. So the concept of $\gcd(u,v)$
is important and worthy of serious study.
The {\sl least common multiple} of two integers
$u$ and $v$, written $\lcm(u, v)$, is a related idea that is
also important. It is defined to be the smallest positive integer
that is a multiple of (i.e., evenly divisible by) both $u$
and $v$; and $\lcm(0, 0) = 0$. The classical method for teaching
children how to add fractions $u/u↑\prime + v/v↑\prime$ is to
train them to find the ``least common denominator,'' which is
$\lcm(u↑\prime , v↑\prime )$.
According to the ``fundamental theorem of arithmetic''
(proved in exercise 1.2.4--21), each positive integer $u$ can
be expressed in the form
$$u = 2↑{u↓2}3↑{u↓3}5↑{u↓5}7↑{u↓7}11↑{u↓{11}}
\ldots = \prod ↓{p\,\,\hbox{\:d prime}}p↑{u↓p},\eqno (5)$$
where the exponents $u↓2$, $u↓3$, $\ldots$ are uniquely
determined nonnegative integers, and where all but a finite
number of the exponents are zero. From this canonical factorization
of a positive integer, it is easy to discover one way to compute
the greatest common divisor of $u$ and $v$: By (2), (3), and
(4), we may assume that $u$ and $v$ are positive integers, and
if both of them have been canonically factored into primes
we have
$$\eqalignno{\gcd(u, v) ⊗ = \prod ↓{p\,\,\hbox{\:d prime}}p↑{\min(u↓p,v↓p)},⊗(6)\cr
\lcm(u, v) ⊗= \prod ↓{p\,\,\hbox{\:d prime}}p↑{\max(u↓p,v↓p)}.⊗(7)\cr}$$
Thus, for example, the greatest common divisor of $u = 7000
= 2↑3 \cdot 5↑3 \cdot 7$ and $v = 4400 = 2↑4 \cdot 5↑2 \cdot
11$ is $2↑{\min(3,4)}\,5↑{\min(3,2)}\,7↑{\min(1,0)}\,11↑{\min(0,1)} = 2↑3
\cdot 5↑2 = 200$. The least common multiple of the same two numbers
is $2↑4 \cdot 5↑3 \cdot 7 \cdot 11 = 154000$.
From formulas (6) and (7) we can easily prove a
number of basic identities concerning the gcd and the lcm:
$$\vbox{\tabskip 0pt plus 1000pt minus 1000pt \baselineskip 15pt
\halign to size{\hfill$\dispstyle{#}$\tabskip 0pt
⊗$\dispstyle{\null#}$\hfill\qquad⊗#\hfill\tabskip 0 pt plus 1000pt minus 1000pt
⊗\hfill$ # $\tabskip 0pt\cr
\gcd(u, v)w ⊗=\gcd(uw, vw),⊗if $w ≥ 0$;⊗(8)\cr
\lcm(u, v)w ⊗=\lcm(uw, vw),⊗if $w ≥ 0$;⊗(9)\cr
u \cdot v ⊗=\gcd(u, v) \cdot \lcm(u, v),⊗if $u,v ≥ 0$;⊗(10)\cr
\gcd\biglp\lcm(u, v),\lcm(u, w)\bigrp ⊗=\lcm\biglp
u,\gcd(v, w)\bigrp;⊗⊗(11)\cr
\lcm\biglp\gcd(u,v),\gcd(u,w)\bigrp⊗=\gcd\biglp
u,\lcm(v,w)\bigrp.⊗⊗(12)\cr}}$$
The latter two formulas are ``distributive laws''
analogous to the familiar identity $uv + uw = u(v + w)$. Equation
(10) reduces the calculation of $\gcd(u, v)$ to the calculation
of $\lcm(u, v)$, and conversely.
\subsectionbegin{Euclid's algorithm} Although Eq.\
(6) is useful for theoretical purposes, it is generally no help
for calculating a greatest common divisor in practice, because
it requires that we first determine the factorization of $u$
and $v$. There is no known method for finding the prime factors
of an integer very rapidly (see Section 4.5.4). But fortunately
there is an efficient way to calculate the greatest common divisor
of two integers without factoring them, and, in fact, such a
method was discovered over 2250 years ago; this is ``Euclid's
algorithm,'' which we have already examined in Sections 1.1
and 1.2.1.
Euclid's algorithm is found in Book 7, Propositions
1 and 2 of his {\sl Elements} ({\sl c}.\ 300 {\:m B.C.}),
but it probably wasn't his own invention. Scholars
believe that the method was known up to 200 years earlier, at
least in its subtractive form, and it was almost certainly known
to Eudoxus ({\sl c}.\ 375 {\:m B.C.}); cf.\
K. von Fritz, {\sl Ann.\ Math.}\ (2) {\bf 46} (1945), 242--264.
We might call it the granddaddy of all algorithms, because it
is the oldest nontrivial algorithm that has survived to the
present day.\xskip (The chief rival for this honor is perhaps the
ancient Egyptian method for multiplication, which was based
on doubling and adding, and which forms the basis for efficient
calculation of $n$th powers as explained in Section 4.6.3. But
the Egyptian manuscripts merely give examples that are not
completely systematic, and the general procedure was never stated explicitly;
the Egyptian method is therefore not quite deserving of the
name ``algorithm.'' Several ancient Babylonian methods, for
doing such things as solving special sets of quadratic equations
in two variables, are also known. Genuine algorithms are involved
in this case, not just special solutions to the equations for
certain input parameters; even though the Babylonians invariably
presented each method in conjunction with an example worked
with particular input data, they regularly explained the general
procedure in the accompanying text.\xskip [See D. E. Knuth, {\sl CACM \bf 15}
(1972), 671--677.]\xskip Many of these Babylonian algorithms predate
Euclid by 1500 years, and they are the earliest known instances
of written procedures for mathematics. But they do not have the
stature of Euclid's algorithm, since they do not involve
iteration and since they have been superseded by modern algebraic
methods.)
In view of the importance of Euclid's algorithm,
for historical as well as practical reasons, let us now consider how
Euclid himself treated it. Paraphrasing his words into modern
terminology, this is essentially what he wrote:
%folio 417 galley 2 (C) Addison-Wesley 1978 *
\ninepoint
\yyskip\hang{\bf Proposition.}\xskip {\sl
Given two positive integers, find their greatest common divisor.}
\yskip\hang Let $A$, $C$ be the two given positive integers;
it is required to find their greatest common divisor. If $C$
divides $A$, then $C$ is a common divisor of $C$ and $A$, since
it also divides itself. And it clearly is in fact the greatest,
since no greater number than $C$ will divide $C$.
\yskip\hang But if $C$ does not divide $A$, then continually
subtract the lesser of the numbers\penalty-50\ $A$, $C$ from the greater,
until some number is left that divides the previous one. This
will eventually happen, for if unity is left, it will divide
the previous number.
\yskip\hang Now let $E$ be the positive remainder of $A$
divided by $C$; let $F$ be the positive remainder of $C$ divided
by $E$; and let $F$ be a divisor of $E$. Since $F$ divides $E$
and $E$ divides $C - F$, $F$ also divides $C - F$; but it also
divides itself, so it divides $C$. And $C$ divides $A - E$;
therefore $F$ also divides $A - E$. But it also divides $E$;
therefore it divides $A$. Hence it is a common divisor of $A$
and $C$.
\yskip\hang I now claim that it is also the greatest. For
if $F$ is not the greatest common divisor of $A$ and $C$, some
larger number will divide them both. Let such a number be $G$.
\yskip\hang Now since $G$ divides $C$ while $C$ divides $A
- E$, $G$ divides $A - E$. $G$ also divides the whole of $A$, so
it divides the remainder $E$. But $E$ divides $C - F$; therefore
$G$ also divides $C - F$. And $G$ also divides the whole of $C$, so
it divides the remainder $F$; that is, a greater number divides
a smaller one. This is impossible.
\yskip\hang Therefore no number greater than $F$ will divide
$A$ and $C$, so $F$ is their greatest common divisor.
\yskip\hang {\bf Corollary.}\xskip This argument makes it
evident that any number dividing two numbers divides their greatest
common divisor. {\sl Q.E.D.}
\tenpoint\yyskip\noindent {\sl Note.}\xskip Euclid's
statements have been simplified here in one nontrivial respect:
Greek mathematicians did not regard unity as a ``divisor'' of
another positive integer. Two positive integers were either
both equal to unity, or they were relatively prime, or they had
a greatest common divisor. In fact, unity was not even considered
to be a ``number,'' and zero was of course nonexistent. These
rather awkward conventions made it necessary for Euclid to duplicate
much of his discussion, and he gave two separate propositions
that each are essentially like the one appearing here.
In his discussion, Euclid first suggests subtracting
the smaller of the two current numbers from the larger, repeatedly,
until we get two numbers in which one is a multiple of another.
But in the proof he really relies on taking the remainder of
one number divided by another; and since he has no simple concept of
zero, he cannot speak of the remainder when one number divides
the other. It is reasonable to say that he imagines each
{\sl division} (not the individual subtractions) as a single
step of the algorithm, and hence an ``authentic'' rendition
of his algorithm can be phrased as follows:
\algbegin Algorithm E (Original Euclidean algorithm).
Given two integers $A$ and $C$ greater than unity, this algorithm
finds their greatest common divisor.
\algstep E1. [$A$ divisible by $C$?] If $C$ divides
$A$, the algorithm terminates with $C$ as the answer.
\algstep E2. [Replace $A$ by remainder.] If $A\mod C$ is equal
to unity, the given numbers were relatively prime, so the algorithm
terminates. Otherwise replace the pair of values $(A, C)$ by
$(C, A \mod C)$ and return to step E1.\quad\blackslug
\yyskip The ``proof'' Euclid gave,
which is quoted above, is especially interesting because it
is not really a proof at all! He verifies the result
of the algorithm only if step E1 is performed once or thrice.
Surely he must have realized that step E1 could take place more
than three times, although he made no mention of such a possibility.
Not having the notion of a proof by mathematical induction,
he could only give a proof for a finite number of cases.\xskip (In
fact, he often proved only the case $n = 3$ of a theorem that
he wanted to establish for general $n$.)\xskip Although Euclid is
justly famous for the great advances he made in the art of logical
deduction, techniques for giving valid proofs by induction were
not discovered until many centuries later, and the crucial ideas
for proving the validity of {\sl algorithms} are only now becoming
really clear.\xskip (See Section 1.2.1 for a complete proof of Euclid's
algorithm, together with a short discussion of general proof
procedures for algorithms.)
It is worth noting that this algorithm for finding
the greatest common divisor was chosen by Euclid to be the very
first step in his development of the theory of numbers. The
same order of presentation is still in use today in modern textbooks.
Euclid also gave (Proposition 34) a method to find the least
common multiple of two integers $u$ and $v$, namely to divide
$u$ by $\gcd(u, v)$ and to multiply the result by $v$; this is equivalent
to Eq.\ (10).
If we avoid Euclid's bias against the numbers 0
and 1, we can reformulate Algorithm E in the following way:
\algbegin Algorithm A (Modern Euclidean algorithm).
Given nonnegative integers $u$ and $v$, this algorithm finds
their greatest common divisor.\xskip $\biglp${\sl Note:} The greatest
common divisor of {\sl arbitrary} integers $u$ and $v$ may be
obtained by applying this algorithm to $|u|$ and $|v|$, because
of Eqs.\ (2) and (3).$\bigrp$
\algstep A1. [$v = 0$?] If $v = 0$, the
algorithm terminates with $u$ as the answer.
\algstep A2. [Take $u \mod v$.] Set
$r ← u\mod v$, $u ← v$, $v ← r$, and return to A1.\xskip (The operations
of this step decrease the value of $v$, but they leave $\gcd(u, v)$
unchanged.)\quad\blackslug
\yyskip For example, we may calculate $\gcd(40902,
24140)$ as follows:
$$\eqalign{\gcd(40902,24140)⊗=\gcd(24140,16762) = \gcd(16762,
7378)\cr
⊗= \gcd(7378, 2006) = \gcd(2006, 1360) = \gcd(1360, 646)\cr
⊗= \gcd(646, 68) = \gcd(68, 34) = \gcd(34, 0) = 34.\cr}$$
A proof that Algorithm A is valid follows readily
from Eq.\ (4) and the fact that
$$\gcd(u, v) =\gcd(v, u - qv),\eqno (13)$$
if $q$ is any integer. Equation (13) holds because
any common divisor of $u$ and $v$ is a divisor of both $v$ and
$u - qv$, and, conversely, any common divisor of $v$ and $u
- qv$ must divide both $u$ and $v$.
The following \MIX\ program illustrates the fact
that Algorithm A can easily be implemented on a computer:
\algbegin Program A (Euclid's algorithm). Assume that
$u$ and $v$ are single-precision, nonnegative integers, stored
respectively in locations \.U and \.V; this program puts $\gcd(u,
v)$ into rA.
{\yyskip\tabskip90pt\mixfive{\!
⊗⊗LDX⊗U⊗1⊗$\rX←u$.\cr
⊗⊗JMP⊗2F⊗1\cr
⊗1H⊗STX⊗V⊗T⊗$v ←\rX$.\cr
⊗⊗SRAX⊗5⊗T⊗$\rAX←\rA$.\cr
⊗⊗DIV⊗V⊗T⊗$\rX←\rAX\mod v$.\cr
\\⊗2H⊗LDA⊗V⊗1+T⊗$\rA←v$.\cr
⊗⊗JXNZ⊗1B⊗1+T⊗Done if $\rX=0$.\quad\blackslug\cr}}
\yyskip\noindent
The running time for this program is $19T + 6$
cycles, where $T$ is the number of divisions performed. The
discussion in Section 4.5.3 shows that we may take $T = 0.842766
\ln N + 0.06$ as an approximate average value, when $u$ and $v$
are independently and uniformly distributed in the range $1
≤ u, v≤ N$.
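Readers who do not speak \MIX\ may prefer the following C transliteration of Algorithm A\null; running it reproduces the chain of pairs displayed above. This is only a sketch, of course, not the program whose running time was just analyzed.

    #include <stdio.h>

    int main(void)
    {
        unsigned long u = 40902, v = 24140;
        while (v != 0) {                      /* step A1 */
            printf("gcd(%lu, %lu) = ", u, v);
            unsigned long r = u % v;          /* step A2 */
            u = v; v = r;
        }
        printf("%lu\n", u);                   /* the chain ends with 34 */
        return 0;
    }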
%folio 419 galley 3 (C) Addison-Wesley 1978 *
\subsectionbegin{A binary method} Since Euclid's
patriarchal algorithm has been used for so many centuries, it
is a rather surprising fact that it may not always be the best
method for finding the greatest common divisor after all. A
quite different gcd algorithm, which is primarily suited to
binary arithmetic, was discovered by J. Stein in 1961 [see {\sl
J. Comp.\ Phys.\ \bf 1} (1967), 397--405]. This new algorithm requires
no division instruction; it relies solely on the operations of (i) subtraction,
(ii) testing whether a number is even or odd, and (iii) shifting the binary
representation of an even number to the right (halving).
The binary gcd algorithm is based on four simple facts about positive
integers $u$ and $v$:
\yskip\textindent{a)}If $u$ and $v$ are both even, then $\gcd(u,
v) = 2\gcd(u/2, v/2)$.\xskip [See Eq.\ (8).]
\textindent{b)}If $u$ is even and $v$ is odd, then $\gcd(u, v) =
\gcd(u/2, v)$.\xskip [See Eq.\ (6).]
\textindent{c)}As in Euclid's algorithm, $\gcd(u, v) = \gcd(u - v,
v)$.\xskip [See Eqs.\ (13), (2).]
\textindent{d)}If $u$ and $v$ are both odd, then $u - v$ is even,
and $|u - v| < \max(u, v)$.
\yskip\noindent These facts immediately suggest the following
algorithm:
\algbegin Algorithm B (Binary gcd algorithm). Given
positive integers $u$ and $v$, this algorithm finds their greatest
common divisor.
\algstep B1. [Find power of 2.] Set $k ←
0$, and then repeatedly set $k ← k + 1$, $u ← u/2$, $v ← v/2$ zero
or more times until $u$ and $v$ are not both even.
\algstep B2. [Initialize.] (Now the original values of $u$ and
$v$ have been divided by $2↑k$, and at least one of their present
values is odd.)\xskip If $u$ is odd, set $t ← -v$ and go to B4. Otherwise
set $t ← u$.
\algstep B3. [Halve $t$.] (At this point, $t$ is even,
and nonzero.)\xskip Set $t ← t/2$.
\algstep B4. [Is $t$ even?] If $t$ is even, go back to B3.
\algstep B5. [Reset $\max(u, v)$.] If $t > 0$, set $u ← t$;
otherwise set $v ← -t$.\xskip (The larger of $u$ and $v$ has been
replaced by $|t|$, except perhaps during the first time this
step is performed.)
\algstep B6. [Subtract.] Set $t ← u - v$. If $t ≠ 0$, go back
to B3. Otherwise the algorithm terminates with $u \cdot 2↑k$
as the output.\quad\blackslug
\topinsert{\vskip 48mm
\ctrline{\caption Fig.\ 9. Binary algorithm for the greatest common divisor.}}
\yyskip As an example of Algorithm
B\null, let us consider $u = 40902$, $v = 24140$, the same numbers
we have used to try out Euclid's algorithm. Step B1 sets $k
← 1$, $u ← 20451$, $v ← 12070$. Then $t$ is set to $-12070$, and replaced
by $-6035$; $v$ is replaced by 6035, and the computation proceeds
as follows:
$$\vbox{\halign{\hfill#⊗\qquad\hfill#⊗\qquad#\hfill\cr
$u$\hfill⊗$v$\hfill⊗\hfill$t$\cr
\noalign{\vskip 3pt}
20451⊗6035⊗$ +14416$, $+7208$, $+3604$, $+1802$, $+901$;\cr
901⊗6035⊗$-5134$, $-2567$;\cr
901⊗2567⊗$-1666$, $-833$;\cr
901⊗833⊗$+68$, $+34$, $+17$;\cr
17⊗833⊗$-816$, $-408$, $-204$, $-102$, $-51$;\cr
17⊗51⊗$-34$, $-17$;\cr
17⊗17⊗0.\cr}}$$
The answer is $17 \cdot 2↑1 = 34$. A few more iterations
were necessary here than we needed with Algorithm A\null, but each
iteration was somewhat simpler since no division steps were
used.
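A C transliteration of Algorithm B (again only a sketch, with \.{unsigned long} standing in for single-precision operands) shows how simple the individual operations are:

    unsigned long binary_gcd(unsigned long u, unsigned long v)
    {   /* u, v > 0 */
        int k = 0;
        long t;
        while (u % 2 == 0 && v % 2 == 0) { u /= 2; v /= 2; k++; }  /* B1 */
        t = (u % 2 == 1) ? -(long)v : (long)u;                     /* B2 */
        do {
            while (t % 2 == 0) t /= 2;                    /* B3, B4 */
            if (t > 0) u = (unsigned long) t;             /* B5 */
            else       v = (unsigned long)(-t);
            t = (long)u - (long)v;                        /* B6 */
        } while (t != 0);
        return u << k;                                    /* u * 2^k */
    }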
A \MIX\ program for Algorithm B requires just a little
more code than for Algorithm A\null. In order to make such a program
fairly typical of a binary computer's representation of Algorithm
B\null, let us assume that \MIX\ is extended to include the following
operators:
\yyskip\noindent$\bullet$ \.{SLB} (shift left AX binary).\xskip
C$\null = 6$;\xskip F$\null = 6$. \par\penalty1000\noindent
The contents of registers A and X are ``shifted
left'' M binary places; that is, $|\rAX| ← |2↑{\char'115}\rAX|\mod B↑{10}$,
where $B$ is the byte size.\xskip (As with all \MIX\ shift commands,
the signs of rA and rX are not affected.)
\yyskip\noindent$\bullet$ \.{SRB} (shift
right AX binary).\xskip
C$\null = 6$;\xskip F$\null = 7$.\par\penalty1000\noindent
The contents of registers A
and X are ``shifted right'' M binary places; that is, $|\rAX|
← \lfloor |\rAX|/2↑{\char'115}\rfloor $.
\yyskip\noindent$\bullet$ \.{JAE}, \.{JAO} (jump
A even, jump A odd).\xskip C$\null = 40$;\xskip F$\null = 6$, 7, respectively.\par
\penalty1000\noindent
A \.{JMP} occurs if rA is even or odd, respectively.
\yyskip\noindent$\bullet$ \.{JXE}, \.{JXO} (jump
X even, jump X odd).\xskip C$\null = 47$;\xskip F$\null = 6$, 7, respectively.\par
\penalty1000\noindent
Analogous to \.{JAE}, \.{JAO}.
\algbegin Program B (Binary gcd algorithm).
Assume that $u$ and $v$ are single-precision positive integers,
stored respectively in locations \.U and \.V; this program uses
Algorithm B to put $\gcd(u, v)$ into rA. Register assignments:
$t ≡ \rA$, $k ≡ \rI1$.
%folio 421 galley 4 WARNING: Much tape unreadable! (C) Addison-Wesley 1978 *
{\yyskip\tabskip 25pt \mixfive{\!
01⊗ABS⊗EQU⊗1:5\cr
02⊗B1⊗ENT1⊗0⊗1⊗\understep{B1. Find }{\sl p\hskip-3pt}\understep{\hskip 3pt
ower of 2.}\cr
03⊗⊗LDX⊗U⊗1⊗$\rX ← u$.\cr
04⊗⊗LDAN⊗V⊗1⊗$\rA ← -v$.\cr
05⊗⊗JMP⊗1F⊗1\cr
\\06⊗2H⊗SRB⊗1⊗A⊗Halve rA, rX.\cr
07⊗⊗INC1⊗1⊗A⊗$k ← k + 1$.\cr
08⊗⊗STX⊗U⊗A⊗$u ← u/2$.\cr
09⊗⊗STA⊗V(ABS)⊗A⊗$v ← v/2$.\cr
\\10⊗1H⊗JXO⊗B4⊗1+A⊗To B4 with $t←-v$ if $u$ is odd.\cr
\\11⊗B2⊗JAE⊗2B⊗B+A⊗\understep{B2. Initialize.}\cr
\\12⊗⊗LDA⊗U⊗B⊗$t←u$.\cr
\\13⊗B3⊗SRB⊗1⊗D⊗\understep{B3. Halve $t$.}\cr
\\14⊗B4⊗JAE⊗B3⊗1-B+D⊗\understep{B4. Is $t$ even?}\cr
\\15⊗B5⊗JAN⊗1F⊗C⊗\understep{B5. Reset $\max\hskip1pt$}\hskip-1pt(\understep{$u$}\!
,\hskip-1pt\understep{$\hskip1pt\,v$})\hskip-1pt\understep{\hskip1pt.}\cr
16⊗⊗STA⊗U⊗E⊗If $t>0$, set $u←t$.\cr
17⊗⊗SUB⊗V⊗E⊗$t←u-v$.\cr
18⊗⊗JMP⊗2F⊗E\cr
\\19⊗1H⊗STA⊗V(ABS)⊗C-E⊗If $t<0$, set $v←-t$.\cr
20⊗B6⊗ADD⊗U⊗C-E⊗\understep{B6. Subtract.}\cr
\\21⊗2H⊗JANZ⊗B3⊗C⊗To B3 if $t≠0$.\cr
\\22⊗⊗LDA⊗U⊗1⊗$\rA←u$.\cr
23⊗⊗ENTX⊗0⊗1⊗$\rX←0$.\cr
24⊗⊗SLB⊗0,1⊗1⊗$\rA←2↑k\cdot\rA$.\quad\blackslug\cr}}
\yyskip The running time of this program is
$$9A+2B+6C+3D+E+13$$
units, where $A=k$, $B=1$ if $t←u$ in step B2 (otherwise $B=0$), $C$ is the
number of subtraction steps, $D$ is the number of halvings in step B3, and
$E$ is the number of times $t>0$ in step B5. Calculations discussed later in
this section imply that we may take $A={1\over3}$, $B={1\over3}$, $C=0.71n-0.5$,
$D=1.41n-2.7$, $E=0.35n-0.4$ as average values for these quantities, assuming
random inputs $u$ and $v$ in the range $1≤u,v<2↑n$. The total running time is
therefore about $8.8n+5$ cycles, compared to about $11.1n+7$ for Program A under
the same assumptions. The worst possible running time for $u$ and $v$ in this
range occurs when $A=0$, $B=0$, $C=n$, $D=2n-2$, $E=1$; this amounts to
$12n+8$ cycles.\xskip (The corresponding value for Program A is $26.8n+19$.)
Thus the greater speed of the iterations in Program B, due to the simplicity of
the operations, compensates for the greater number of iterations required.
We have found that the binary algorithm is about 20 percent faster than
Euclid's algorithm on the \MIX\ computer. Of course, the situation may be
different on other computers, and in any event both programs are quite
efficient; but it appears that not even a procedure as venerable as Euclid's
algorithm can withstand progress.
V. C. Harris [{\sl Fibonacci Quarterly \bf8} (1970), 102--103] has suggested an
interesting cross between Euclid's algorithm and the binary algorithm. If
$u$ and $v$ are odd, with $u≥v>0$, we can always write $u=qv\pm r$ where
$0≤r<v$ and $r$ is even; if $r≠0$ we set $r←r/2$ until $r$ is odd, then set
$u←v$, $v←r$ and repeat the process. In subsequent iterations, $q≥3$.
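A C sketch of Harris's iteration appears below; since $v$ is odd, exactly one of $u\mod v$ and $v - (u\mod v)$ is even (unless the remainder is zero), and that one serves as the even $r$ of the text. The function name is of course only illustrative.

    unsigned long harris_gcd(unsigned long u, unsigned long v)
    {   /* u >= v > 0, both odd */
        for (;;) {
            unsigned long r = u % v;      /* u = qv + r */
            if (r == 0) return v;
            if (r % 2 != 0) r = v - r;    /* use u = (q+1)v - (v-r) instead */
            while (r % 2 == 0) r /= 2;    /* cast out 2s; gcd unchanged, v odd */
            u = v; v = r;
        }
    }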
\subsectionbegin{Extensions} We can extend the methods
used to calculate $\gcd(u, v)$ in order to solve some slightly
more difficult problems. For example, assume that we want to
compute the greatest common divisor of $n$ integers $u↓1$, $u↓2$,
$\ldotss$, $u↓n$.
One way to calculate $\gcd(u↓1, u↓2, \ldotss , u↓n)$,
assuming that the $u$'s are all nonnegative, is to extend Euclid's
algorithm in the following way: If all $u↓j$ are zero, the greatest
common divisor is taken to be zero; otherwise if only one $u↓j$
is nonzero, it is the greatest common divisor; otherwise replace
$u↓k$ by $u↓k \mod u↓j$ for all $k ≠ j$, where $u↓j$ is the
minimum of the nonzero $u$'s.
The algorithm sketched in the preceding paragraph
is a natural generalization of Euclid's method, and it can be
justified in a similar manner. But there is a simpler method
available, based on the easily verified identity
$$\gcd(u↓1, u↓2, \ldotss , u↓n) =\gcd\biglp u↓1,\gcd(u↓2,
\ldotss , u↓n)\bigrp .\eqno (14)$$
To calculate $\gcd(u↓1, u↓2, \ldotss , u↓n)$, we
may therefore proceed as follows:
\yskip\hang\textindent{\bf D1.}Set $d ← u↓n$, $j ← n - 1$.
\yskip\hang\textindent{\bf D2.}If $d≠1$ and $j>0$, set $d←\gcd(u↓j, d)$
and $j ← j - 1$ and repeat this step. Otherwise
$d = \gcd(u↓1, \ldotss , u↓n)$.
\yskip\noindent This method reduces the calculation
of $\gcd(u↓1, \ldotss , u↓n)$ to repeated calculations of the
greatest common divisor of two numbers at a time. It makes use
of the fact that $\gcd(u↓1, \ldotss , u↓j, 1) = 1$; and this will
be helpful, since we will already have $\gcd(u↓{n-1}, u↓n) =
1$ over 60 percent of the time if $u↓{n-1}$ and $u↓n$ are chosen
at random. In most cases, the value of $d$ will decrease rapidly during
the first few stages of the calculation, and this will make the
remainder of the computation quite fast. Here Euclid's algorithm
has an advantage over Algorithm B\null, in that its running time
is primarily governed by the value of $\min(u, v)$, while the
running time for Algorithm B is primarily governed by $\max(u,
v)$; it would be reasonable to perform one iteration of Euclid's
algorithm, replacing $u$ by $u \mod v$ if $u$ is much larger
than $v$, and then to continue with Algorithm B.
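Steps D1 and D2 translate directly into C; the following sketch assumes the \.{gcd} routine shown earlier and zero-origin indexing.

    long gcd_n(const long u[], int n)     /* n >= 1, all u[i] >= 0 */
    {
        long d = u[n-1];                  /* step D1 */
        int j = n - 2;
        while (d != 1 && j >= 0)          /* step D2 */
            d = gcd(u[j--], d);
        return d;                         /* gcd(u[0], ..., u[n-1]) */
    }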
The assertion that $\gcd(u↓{n-1}, u↓n)$ will be
equal to unity more than 60 percent of the time for random inputs
is a consequence of the following well-known result of number
theory:
\algbegin Theorem D ({\rm G. Lejeune Dirichlet, {\sl Abh.\ K\"oniglich
Preu\ss.\ Akad.\ Wiss.}\ (1849), 69--83}). {\sl If $u$ and $v$ are integers
chosen at random, the probability that $\gcd(u, v) = 1$ is $6/π↑2\approx .60793$.}
\yyskip A precise formulation of this theorem, which carefully
defines what is meant by being ``chosen at random,'' appears
in exercise 10 with a rigorous proof. Let us content ourselves
here with a heuristic argument that shows why the theorem is plausible.
%folio 423 galley 5 (C) Addison-Wesley 1978 *
If we assume, without proof, the existence
of a well-defined probability $p$ that $\gcd(u, v)$ equals unity,
then we can determine the probability that $\gcd(u, v) = d$ for
any positive integer $d$; for the latter event will happen only
when $u$ is a multiple of $d$, and $v$ is a multiple of $d$, and $\gcd(u/d,
v/d) = 1$. Thus the probability that $\gcd(u, v) = d$ is equal
to $1/d$ times $1/d$ times $p$, namely $p/d↑2$. Now let us sum
these probabilities over all possible values of $d$; we should get
$$\chop to 12pt{1 = \sum ↓{d≥1} p/d↑2 = p(1 + {1\over 4} + {1\over 9} + {1\over
16} +\cdotss ).}$$
Since the sum $1 + {1\over 4} + {1\over 9}+
\cdots = H↑{(2)}↓{\!∞}$ is equal to $π↑2/6$ (cf.\ Section 1.2.7),
we need $p=6/π↑2$ in order to make this
equation come out right.\quad\blackslug
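Theorem D is easy to test empirically. The following C sketch estimates the probability by straightforward sampling; the range 1 to 10000 and the use of \.{rand()} are arbitrary choices, adequate only for rough accuracy.

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        int coprime = 0, trials = 1000000;
        for (int i = 0; i < trials; i++) {
            long u = 1 + rand() % 10000, v = 1 + rand() % 10000;
            while (v != 0) { long r = u % v; u = v; v = r; }
            if (u == 1) coprime++;
        }
        printf("%.5f\n", (double) coprime / trials);   /* near .60793 */
        return 0;
    }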
\yyskip Euclid's algorithm can be extended
in another important way: We can calculate integers $u↑\prime$
and $v↑\prime$ such that
$$uu↑\prime + vv↑\prime = \gcd(u, v)\eqno (15)$$
at the same time $\gcd(u, v)$ is being calculated.
This extension of Euclid's algorithm can be described conveniently
in vector notation:
\algbegin Algorithm X (Extended Euclid's algorithm).
Given nonnegative integers $u$ and $v$, this algorithm determines
a vector $(u↓1, u↓2, u↓3)$ such that $uu↓1 + vu↓2 = u↓3 = \gcd(u,
v)$. The computation makes use of auxiliary vectors $(v↓1, v↓2,
v↓3)$, $(t↓1, t↓2, t↓3)$; all vectors are manipulated in such
a way that the relations
$$ut↓1 + vt↓2 = t↓3,\qquad uu↓1 + vu↓2 = u↓3,\qquad uv↓1 +
vv↓2 = v↓3\eqno (16)$$
hold throughout the calculation.
\algstep X1. [Initialize.] Set $(u↓1, u↓2, u↓3) ←
(1, 0, u)$, $(v↓1, v↓2, v↓3) ← (0, 1, v)$.
\algstep X2. [Is $v↓3 = 0$?] If $v↓3 = 0$, the algorithm terminates.
\algstep X3. [Divide, subtract.] Set $q ← \lfloor u↓3/v↓3\rfloor
$, and then set
$$\baselineskip15pt
\cpile{(t↓1, t↓2, t↓3) ← (u↓1, u↓2, u↓3) - (v↓1, v↓2, v↓3)q,\cr
(u↓1, u↓2, u↓3) ← (v↓1, v↓2, v↓3),\qquad (v↓1,
v↓2, v↓3) ← (t↓1, t↓2, t↓3).\cr}$$
Return to step X2.\quad\blackslug
\yyskip\noindent For example, let $u = 40902$, $v = 24140$.
At step X2 we have
$$\vbox{\tabskip20pt\halign{\hfill#⊗\hfill$#$⊗\hfill$#$⊗\hfill#⊗\hfill$#$⊗\hfill
$#$⊗\hfill#\cr
\vbox to 7pt{}$q$⊗u↓1⊗u↓2⊗$u↓3$\hfill⊗v↓1⊗v↓2⊗$v↓3$\hfill\cr
\noalign{\vskip 3pt}
---⊗1⊗0⊗40902⊗0⊗1⊗24140\cr
1⊗0⊗1⊗24140⊗1⊗-1⊗16762\cr
1⊗1⊗-1⊗16762⊗-1⊗2⊗7378\cr
2⊗-1⊗2⊗7378⊗3⊗-5⊗2006\cr
3⊗3⊗-5⊗2006⊗-10⊗17⊗1360\cr
1⊗-10⊗17⊗1360⊗13⊗-22⊗646\cr
2⊗13⊗-22⊗646⊗-36⊗61⊗68\cr
9⊗-36⊗61⊗68⊗337⊗-571⊗34\cr
2⊗337⊗-571⊗34⊗-710⊗1203⊗0\cr}}$$
The solution is therefore $337 \cdot 40902 - 571 \cdot
24140 = 34 = \gcd(40902, 24140)$.
\yskip The validity of Algorithm X follows from
(16) and the fact that the algorithm is identical to Algorithm
A with respect to its manipulation of $u↓3$ and $v↓3$. A detailed
proof of Algorithm X is discussed in Section 1.2.1. Gordon H.
Bradley has observed that we can avoid a good deal of the calculation
in Algorithm X by suppressing $u↓2$, $v↓2$, and $t↓2$; then $u↓2$
can be determined afterwards using the relation $uu↓1 + vu↓2
= u↓3$.\xskip (It is interesting to note that this modified algorithm would
be harder to verify; we often find that the simplest way to prove an
``optimized'' algorithm correct is to show that it is equivalent to an
unoptimized algorithm whose correctness is easier to establish.)
Exercise 14 shows that the values of $|u↓1|$, $|u↓2|$,
$|v↓1|$, $|v↓2|$ remain bounded by the size of the inputs $u$ and
$v$.
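In C, Algorithm X (without Bradley's simplification) might be sketched as follows; applied to $u = 40902$ and $v = 24140$ it yields $(337, -571, 34)$, in agreement with the table above.

    void ext_gcd(long u, long v, long *u1, long *u2, long *u3)
    {
        long a = 1, b = 0, c = u;         /* (u1, u2, u3), step X1 */
        long d = 0, e = 1, f = v;         /* (v1, v2, v3) */
        while (f != 0) {                  /* step X2 */
            long q = c / f, t;            /* step X3 */
            t = a - q*d; a = d; d = t;
            t = b - q*e; b = e; e = t;
            t = c - q*f; c = f; f = t;
        }
        *u1 = a; *u2 = b; *u3 = c;
    }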
Algorithm B\null, which computes the greatest common
divisor using properties of binary notation, can be extended in a similar
way; see exercise 35. For some
instructive extensions to Algorithm X\null, see exercises 18 and
19 in Section 4.6.1.
\yskip The ideas underlying Euclid's algorithm
can also be applied to find a {\sl general solution in integers}
of any set of linear equations with integer coefficients. For
example, suppose that we want to find all integers $w$, $x$, $y$,
$z$ that satisfy the two equations
$$\vbox{\tabskip0pt plus 100pt\vskip-9pt\baselineskip16pt
\halign to size{$\hfill#$\tabskip0pt⊗$\null#$⊗$\null\hfill#$⊗$\null#$⊗$\null#$\hfill
\tabskip0pt plus 100pt⊗\hfill#\tabskip0pt\cr
10w⊗+ 3x⊗+ 3y⊗+ 8z⊗= 1,⊗(17)\cr
6w⊗- 7x⊗ ⊗- 5z⊗= 2.⊗(18)\cr}}$$
We can introduce a new variable
$$\lfloor 10/3\rfloor w + \lfloor 3/3\rfloor x + \lfloor 3/3\rfloor
y + \lfloor 8/3\rfloor z = 3w + x + y + 2z = t↓1,$$
and use it to eliminate $y$; Eq.\ (17) becomes
$$\quad(10 \mod 3)w + (3 \mod 3)x + 3t↓1 + (8 \mod 3)z = w + 3t↓1
+ 2z = 1,\eqno (19)$$
and Eq.\ (18) remains unchanged. The new equation
(19) may be used to eliminate $w$, and (18) becomes
$$6(1 - 3t↓1 - 2z) - 7x - 5z = 2;$$
that is,
$$7x + 18t↓1 + 17z = 4.\eqno (20)$$
Now as before we introduce a new variable
$$x + 2t↓1 + 2z = t↓2$$
and eliminate $x$ from (20):
$$7t↓2 + 4t↓1 + 3z = 4.\eqno (21)$$
Another new variable can be introduced in the same
fashion, in order to eliminate the variable $z$, which has the
smallest coefficient:
$$2t↓2 + t↓1 + z = t↓3.$$
Eliminating $z$ from (21) yields
$$t↓2 + t↓1 + 3t↓3 = 4,\eqno (22)$$
and this equation, finally, can be used to eliminate
$t↓2$. We are left with two independent variables, $t↓1$ and
$t↓3$; substituting back for the original variables, we obtain
the general solution
$$\baselineskip14pt
\eqalign{w ⊗= \quad 17 -\9 5t↓1 - 14t↓3,\cr
x ⊗=\quad 20 - \9 5t↓1- 17t↓3,\cr
y ⊗= -55 + 19t↓1 + 45t↓3,\cr
z ⊗=\9 -8 +\quad t↓1 +\9 7t↓3.\cr}\eqno(23)$$
In other words, all integer solutions
$(w, x, y, z)$ to the original equations (17), (18) are obtained
from (23) by letting $t↓1$ and $t↓3$ independently run through
all integers.
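The general solution (23) is easy to verify mechanically; the following C sketch substitutes a few values of $t↓1$ and $t↓3$ into (23) and checks that (17) and (18) hold.

    #include <stdio.h>

    int main(void)
    {
        for (long t1 = -2; t1 <= 2; t1++)
            for (long t3 = -2; t3 <= 2; t3++) {
                long w =  17 -  5*t1 - 14*t3;
                long x =  20 -  5*t1 - 17*t3;
                long y = -55 + 19*t1 + 45*t3;
                long z =  -8 +    t1 +  7*t3;
                if (10*w + 3*x + 3*y + 8*z != 1 || 6*w - 7*x - 5*z != 2)
                    printf("failure at t1=%ld, t3=%ld\n", t1, t3);
            }
        printf("done\n");    /* prints only "done": (23) checks out */
        return 0;
    }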
%folio 426 galley 6 (C) Addison-Wesley 1978 *
The general method that has just been illustrated
is based on the following procedure: Find a nonzero coefficient
$c$ of smallest absolute value in the system of equations. Suppose
that this coefficient appears in an equation having the form
$$cx↓0+c↓1x↓1+\cdots+c↓kx↓k=d;\eqno(24)$$
we may assume for simplicity that $c > 0$. If
$c = 1$, use this equation to eliminate the variable $x↓0$ from
the other equations remaining in the system; then repeat the
procedure on the remaining equations.\xskip (If no more equations
remain, the computation stops, and a general solution in terms
of the variables not yet eliminated has essentially been obtained.)\xskip
If $c > 1$, then if $c↓1 \mod c =\cdots = c↓k \mod
c = 0$ we must have $d \mod c = 0$, otherwise there is no
integer solution; divide both sides of (24) by $c$ and eliminate
$x↓0$ as in the case $c = 1$. Finally, if $c > 1$ and not all
of $c↓1\mod c$, $\ldotss$, $c↓k\mod c$ are zero, then introduce
a new variable
$$\lfloor c/c\rfloor x↓0 + \lfloor c↓1/c\rfloor
x↓1 +\cdots + \lfloor c↓k/c\rfloor x↓k = t;\eqno(25)$$
eliminate the variable $x↓0$ from the other equations,
in favor of $t$, and replace the original equation (24) by
$$ct + (c↓1 \mod c)x↓1 +\cdots+ (c↓k\mod c)x↓k= d.\eqno (26)$$
$\biglp$Cf.\ (19) and (21) in the above example.$\bigrp$
This process must terminate, since each step either
reduces the number of equations or the size of the smallest
nonzero coefficient in the system. A study of the above procedure
will reveal its intimate connection with Euclid's algorithm.
The method is a comparatively simple means of solving linear
equations when the variables are required to take on integer
values only. It isn't the best available method for this problem,
however; substantial refinements are possible, but beyond the scope of
this book.
\subsectionbegin{High-precision calculation} If $u$
and $v$ are very large integers, requiring a multiple-precision
representation, the binary method (Algorithm B) is a simple
and fairly efficient means of calculating their greatest
common divisor, since it involves only subtractions and shifting.
By contrast, Euclid's algorithm seems much less
attractive, since step A2 requires a multiple-precision division
of $u$ by $v$. But this difficulty is not really as bad as it
seems, since we will prove in Section 4.5.3 that the quotient
$\lfloor u/v\rfloor$ is almost always very small; for example,
assuming random inputs, the quotient $\lfloor u/v\rfloor$ will
be less than 1000 approximately 99.856 percent of the time.
Therefore it is almost always possible to find $\lfloor u/v\rfloor$
and $(u\mod v)$ using single-precision calculations, together
with the comparatively simple operation of calculating $u -
qv$ where $q$ is a single-precision number. Furthermore, if it does
turn out that $u$ is much larger than $v$ (e.g., the initial input data
might have this form), we don't really mind having a large quotient $q$, since
Euclid's algorithm makes a great deal of progress when it replaces $u$ by $u\mod v$
in such a case.
A significant improvement in the speed of Euclid's
algorithm when high-precision numbers are involved can be achieved
by using a method due to D. H. Lehmer [{\sl AMM \bf 45}
(1938), 227--233]. Working only with the leading digits of large
numbers, it is possible to do most of the calculations with
single-precision arithmetic, and to make a substantial reduction
in the number of multiple-precision operations involved. We save a lot
of time by doing a ``virtual'' calculation instead of the actual one.
Lehmer's method can be illustrated on the eight-digit
numbers $u = 27182818$, $v = 10000000$, assuming that we are using
a machine with only four-digit words. Let $u↑\prime = 2718$,
$v↑\prime = 1001$, $u↑{\prime\prime} = 2719$, $v↑{\prime\prime} = 1000$;
then $u↑\prime/v↑\prime$ and $u↑{\prime\prime} /v↑{\prime\prime}$
are approximations to $u/v$, with
$$u↑\prime /v↑\prime < u/v < u↑{\prime\prime} /v↑{\prime\prime} .\eqno (27)$$
The ratio $u/v$ determines the sequence of quotients
obtained in Euclid's algorithm. If we carry out Euclid's algorithm
simultaneously on the single-precision values $(u↑\prime , v↑\prime
)$ and $(u↑{\prime\prime} , v↑{\prime\prime} )$ until we get a different quotient,
it is not difficult to see that the same sequence of quotients
would have appeared to this point if we had worked with the
multiple-precision numbers $(u, v)$. Thus, consider what happens
when Euclid's algorithm is applied to $(u↑\prime , v↑\prime
)$ and to $(u↑{\prime\prime} , v↑{\prime\prime} )$:
$$\vbox{\halign{\hfill#⊗\qquad\hfill#⊗\qquad\hfill#⊗\hskip80pt\hfill#⊗\qquad
\hfill#⊗\qquad\hfill#\cr
$u↑\prime$\hfill⊗$v↑\prime$\hfill⊗$q↑\prime$\hfill⊗$u↑{\prime\prime}$\hfill
⊗$v↑{\prime\prime}$\hfill⊗$q↑{\prime\prime}$\hfill\cr
\noalign{\vskip 3pt}
2718⊗1001⊗2⊗2719⊗1000⊗2\cr
1001⊗716⊗1⊗1000⊗719⊗1\cr
716⊗285⊗2⊗719⊗281⊗2\cr
285⊗146⊗1⊗281⊗157⊗1\cr
146⊗139⊗1⊗157⊗124⊗1\cr
139⊗7⊗19⊗124⊗33⊗3\cr}}$$
The first five quotients are the same in both cases, so they must be the true
ones. But on the sixth step we find that $q↑\prime ≠ q↑{\prime\prime} $,
so the single-precision calculations are suspended. We have
gained the knowledge that the calculation would have pro\-ceeded
as follows if we had been working with the original multiple-precision
numbers:
$$\vcenter{\halign{$\ctr{#}$⊗\qquad$\ctr{#}$⊗\qquad$\ctr{#}$\cr
\vbox to 12pt{}u⊗v⊗q\cr
\noalign{\vskip 3pt}
u↓0⊗v↓0⊗2\cr
v↓0⊗u↓0- 2v↓0⊗1\cr
u↓0- 2v↓0⊗ -u↓0 + 3v↓0⊗2\cr
-u↓0 + 3v↓0⊗ 3u↓0 - 8v↓0⊗1\cr
3u↓0 -8v↓0⊗ -4u↓0 +11v↓0⊗1\cr
-4u↓0 + 11v↓0⊗ 7u↓0 - 19v↓0⊗?\cr}}\eqno(28)$$
(The next quotient lies somewhere between 3 and 19.)
No matter how many digits are in $u$ and $v$, the first five
steps of Euclid's algorithm would be the same as (28), so long
as (27) holds. We can therefore avoid the multiple-precision
operations of the first five steps, and replace them all by
a multiple-precision calculation of $-4u↓0 + 11v↓0$ and $7u↓0
- 19v↓0$. In this case we obtain $u = 1268728$, $v = 279726$;
the calculation can now proceed with $u↑\prime = 1268$, $v↑\prime
= 280$, $u↑{\prime\prime} = 1269$, $v↑{\prime\prime} = 279$, etc. With a larger
accumulator, more steps could be done by single-precision calculations;
our example showed that only five cycles of Euclid's algorithm
were combined into one multiple step, but with (say) a word
size of 10 digits we could do about twelve cycles at a time.\xskip
(Results proved in Section 4.5.3 imply that the number of multiple-precision
cycles that can be replaced at each iteration is essentially
proportional to the number of digits used in the single-precision
calculations.)
Lehmer's method can be formulated as follows:
\algbegin Algorithm L (Euclid's algorithm for large numbers).
Let $u, v$ be nonnegative multiprecision integers, with $u ≥ v$.
This algorithm computes the greatest
common divisor of $u$ and $v$, making use of auxiliary single-precision
$p$-digit variables $\A u$, $\A v$, $A$, $B$, $C$, $D$, $T$, $q$, and auxiliary
multiple-precision variables $t$ and $w$.
\algstep L1. [Initialize.] If $v$ is small
enough to be represented as a single-precision value, calculate
$\gcd(u, v)$ by Algorithm A and terminate the computation. Otherwise,
let $\A u$ be the $p$ leading digits of $u$, and let $\A v$
be the corresponding digits of $v$; in other words,
if radix-$b$ notation is being used, $\A u←\lfloor u/b↑k\rfloor$ and
$\A v← \lfloor v/b↑k\rfloor $, where $k$ is as small
as possible consistent with the condition $\A u < b↑p$.
\hangindent 19pt after 0 Set $A ← 1$, $B ← 0$, $C ← 0$, $D ← 1$.\xskip (These
variables represent the coefficients in (28), where
$$u = Au↓0 + Bv↓0,\qquad v = Cu↓0 + Dv↓0,\eqno (29)$$
in the equivalent actions of Algorithm
A on multiprecision numbers. We also have
$$u↑\prime = \A u + B,\qquad v↑\prime =
\A v + D,\qquad u↑{\prime\prime} = \A u + A,\qquad v↑{\prime\prime} =
\A v + C\eqno (30)$$
in terms of the notation in the example worked above.)
\algstep L2. [Test quotient.] Set $q ← \lfloor (\A u +
A)/(\A v + C)\rfloor $. If $q ≠ \lfloor (\A u + B)/(
\A v + D)\rfloor $, go to step L4.\xskip (This step tests if $q↑\prime
≠ q↑{\prime\prime}$, in the notation of the above example. Note that
single-precision overflow can occur in special circumstances
during the computation in this step, but only when $\A u
= b↑p - 1$ and $A = 1$ or when $\A v = b↑p - 1$ and $D
= 1$; the conditions
$$\baselineskip15pt
\eqalign{0⊗≤\A u + A ≤ b↑p,\cr 0⊗≤\A u + B < b↑p,\cr}\qquad
\eqalign{0⊗≤\A v + C < b↑p,\cr 0⊗≤\A v + D ≤ b↑p\cr}\eqno(31)$$
will always hold, because of (30). It
is possible to have $\A v + C = 0$ or $\A v + D =
0$, but not both simultaneously; therefore division by zero
in this step is taken to mean ``Go directly to L4.'')
%folio 429 galley 7 (C) Addison-Wesley 1978 *
\algstep L3. [Emulate Euclid.] Set $T ← A - qC$, $A ← C$, $C ← T$,
$T ← B - qD$, $B ← D$, $D ← T$, $T ← \A u - q\A v$,
$\A u ← \A v$, $\A v ← T$, and go back to step L2.\xskip $\biglp$These
single-precision calculations are the equivalent of multiple-precision
operations, as in (28), under the conventions of (29).$\bigrp$
\algstep L4. [Multiprecision step.] If $B = 0$, set $t ← u
\mod v$, $u ← v$, $v ← t$, using multiple-precision division.\xskip (This
happens only if the single-precision operations cannot simulate
any of the multiprecision ones. It implies that Euclid's algorithm
requires a very large quotient, and this is an extremely rare
occurrence.)\xskip Otherwise, set $t ← Au$, $t ← t + Bv$, $w ← Cu$, $w ←
w + Dv$, $u ← t$, $v ← w$ (using straightforward multiprecision
operations). Go back to step L1.\quad\blackslug
\yyskip The values of $A$, $B$, $C$, $D$
remain as single-precision numbers throughout this calculation,
because of (31).
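The heart of Algorithm L is the single-precision loop of steps L2 and L3, which might be sketched in C as follows, with \.{long} standing in for the single-precision $p$-digit variables; the multiprecision work of steps L1 and L4 is not shown. Applied to $\A u = 2718$, $\A v = 1000$ as in the example above, the routine returns $(A, B, C, D) = (-4, 11, 7, -19)$, in agreement with (28).

    /* Run steps L2 and L3 on the leading digits uhat, vhat until the
       quotients for the two bounding ratios (27) disagree.  On exit,
       B == 0 signals that step L4 must do a full multiprecision
       division; otherwise the new multiprecision values are
       t = A*u + B*v and w = C*u + D*v, as in (29). */
    void lehmer_loop(long uhat, long vhat,
                     long *A, long *B, long *C, long *D)
    {
        long a = 1, b = 0, c = 0, d = 1, q, t;
        for (;;) {
            if (vhat + c == 0 || vhat + d == 0) break;   /* "go to L4" */
            q = (uhat + a) / (vhat + c);                 /* step L2 */
            if (q != (uhat + b) / (vhat + d)) break;
            t = a - q*c; a = c; c = t;                   /* step L3 */
            t = b - q*d; b = d; d = t;
            t = uhat - q*vhat; uhat = vhat; vhat = t;
        }
        *A = a; *B = b; *C = c; *D = d;
    }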
Algorithm L requires a somewhat more complicated program
than Algorithm B\null, but with large numbers it will be faster on
many computers. Algorithm B can, however, be speeded up in a
similar way (see exercise 34), to the point where it continues
to win. Algorithm L has the advantage that it can be extended,
as in Algorithm X (see exercise 17); furthermore, it determines
the sequence of quotients obtained in Euclid's algorithm, and
this yields the regular continued fraction expansion of a real
number (see exercise 4.5.3--18).
\subsectionbegin{Analysis of the binary algorithm}
Let us conclude this section by studying the running time of
Algorithm B\null, in order to justify the formulas stated earlier.
An exact determination of the behavior of Algorithm
B appears to be exceedingly difficult to derive, but we can
begin to study it by means of an approximate model of its behavior.
Suppose that $u$ and $v$ are odd numbers, with $u > v$ and
$$\lfloor\lg u\rfloor = m,\qquad \lfloor\lg v\rfloor
= n.\eqno(32)$$
$\biglp$Thus, $u$ is an $(m + 1)$-bit number, and
$v$ is an $(n + 1)$-bit number.$\bigrp$ Algorithm B forms $u
- v$ and shifts this quantity right until obtaining an odd number
$u↑\prime$ that replaces $u$. Under random conditions, we would
expect to have
$u↑\prime = (u - v)/2$
about one-half of the time,
$u↑\prime = (u - v)/4$
about one-fourth of the time,
$u↑\prime = (u - v)/8$
about one-eighth of the time, and so on. We have
$$\lfloor\lg u↑\prime \rfloor = m - k - r,\eqno (33)$$
where $k$ is the number of places that $u - v$
is shifted right, and where $r$ is $\lfloor\lg u\rfloor -
\lfloor\lg (u - v)\rfloor$, the number of bits lost at
the left during the subtraction of $v$ from $u$. Note that $r
≤ 1$ when $m ≥ n + 2$, and $r ≥ 1$ when $m = n$. For simplicity,
we will assume that $r = 0$ when $m ≠ n$ and that $r = 1$ when
$m = n$, although this lower bound tends to make $u↑\prime$
seem larger than it usually is.
The approximate model we shall use to study Algorithm
B is based solely on the values $m = \lfloor\lg u\rfloor$
and $n = \lfloor\lg v\rfloor$ throughout the course of the
algorithm, not on the actual values of $u$ and $v$. Let us call
this approximation a {\sl lattice-point model}, since we will
say that we are ``at the point $(m, n)$'' when $\lfloor\lg
u\rfloor = m$ and $\lfloor\lg v\rfloor = n$. From point $(m,
n)$ the algorithm takes us to $(m↑\prime , n)$ if $u > v$, or
to $(m, n↑\prime )$ if $u < v$, or terminates if $u = v$. For
example, the calculation starting with $u = 20451$, $v = 6035$
that is tabulated after Algorithm B begins at the point $(14,
12)$, then goes to $(9, 12)$, $(9, 11)$, $(9, 9)$, $(4, 9)$, $(4, 5)$,
$(4, 4)$, and terminates. In line with the comments of the preceding
paragraph, we will make the following assumptions about the
probability that we reach a given point just after point $(m,
n)$:
\yyskip
\hbox to size{\vbox{\baselineskip13pt
\hbox to 14.5pc{\hfill Case 1, $m>n$.\hfill}
\vskip 2pt
\tabskip 0pt plus 100pt
\halign to 14.5pc{$\ctr{#}$⊗$\ctr{#}$\cr
\hbox{Next point}⊗\hbox{Probability}\cr
\noalign{\vskip 2pt}
(m-1,n)⊗{1\over2}\cr
(m-2,n)⊗{1\over4}\cr
\noalign{\vskip-1pt}
\cdots⊗\cdots\cr
\noalign{\vskip-1pt}
(1,n)⊗({1\over2})↑{m-1}\cr
(0,n)⊗({1\over2})↑{m-1}\cr}}\!
\vbox{\baselineskip13pt
\hbox to 14.5pc{\hfill Case 2, $m<n$.\hfill}
\vskip 2pt
\tabskip 0pt plus 100pt
\halign to 14.5pc{$\ctr{#}$⊗$\ctr{#}$\cr
\hbox{Next point}⊗\hbox{Probability}\cr
\noalign{\vskip 2pt}
(m,n-1)⊗{1\over2}\cr
(m,n-2)⊗{1\over4}\cr
\noalign{\vskip-1pt}
\cdots⊗\cdots\cr
\noalign{\vskip-1pt}
(m,1)⊗({1\over2})↑{n-1}\cr
(m,0)⊗({1\over2})↑{n-1}\cr}}}
\yyskip
\vbox{\baselineskip13pt
\hbox to size{\hfill Case 3, $m=n>0$.\hfill}
\vskip 2pt
\tabskip 0pt plus 100pt
\halign to size{$\ctr{#}$⊗$\ctr{#}$\cr
\hbox{Next point}⊗\hbox{Probability}\cr
\noalign{\vskip 2pt}
(m-2,n),\,(m,n-2)⊗{1\over4},\,{1\over4}\cr
(m-3,n),\,(m,n-3)⊗{1\over8},\,{1\over8}\cr
\noalign{\vskip-1pt}
\cdots⊗\cdots\cr
\noalign{\vskip-1pt}
(0,n),\,(m,0)⊗({1\over2})↑m,\,({1\over2})↑m\cr
\hbox{terminate}⊗\quad({1\over2})↑{m-1}\cr}}
\yyskip For example, from point $(5, 3)$ the lattice-point model
would go to points $(4, 3)$, $(3, 3)$, $(2, 3)$, $(1, 3)$, $(0, 3)$ with
the respective probabilities ${1\over 2}$, ${1\over 4}$, ${1\over
8}$, ${1\over 16}$, ${1\over 16}$; from $(4, 4)$ it would go to
$(2, 4)$, $(1, 4)$, $(0, 4)$, $(4, 2)$, $(4, 1)$, $(4, 0)$, or would terminate,
with the respective probabilities ${1\over 4}$, ${1\over 8}$,
${1\over 16}$, ${1\over 4}$, $1\over8$, ${1\over 16}$, ${1\over
8}$. When $m = n = 0$, the formulas above do not apply; the
algorithm always terminates in such a case, since $m = n = 0$
implies that $u = v = 1$.
The detailed calculations of exercise 18 show that
this lattice-point model is somewhat pessimistic. In fact, when
$m > 3$ the actual probability that Algorithm\penalty999\ B goes from $(m,
m)$ to one of the two points $(m - 2, m)$ or $(m, m - 2)$ is
equal to ${1\over 8}$, although we have assumed the value ${1\over
2}$; the algorithm actually goes from $(m, m)$ to $(m - 3, m)$
or $(m, m - 3)$ with probability ${7\over 32}$, not ${1\over
4}$; it actually goes from $(m + 1, m)$ to $(m, m)$ with probability
${1\over 8}$, not ${1\over 2}$. The probabilities in the model
are nearly correct when $|m - n|$ is large, but when $|m - n|$
is small the model predicts slower convergence than is actually
obtained. In spite of the fact that our model is not a completely
faithful representation of Algorithm B\null, it has one great virtue,
namely that it can be completely analyzed! Furthermore, empirical
experiments with Algorithm B show that the behavior predicted
by the lattice-point model is analogous to the true behavior.
An analysis of the lattice-point model
can be carried out by solving the following rather complicated
set of double recurrence relations:
$$\vcenter{\baselineskip22pt\halign{$\rt{#}$⊗$\dispstyle\null#,$\hfill\qquad
⊗if $#\hfill$\cr
A↓{mm}⊗= a + {1\over 2}A↓{m(m-2)} +\cdots
+ {1\over 2↑{m-1}} A↓{m0} + {b\over 2↑{m-1}}⊗m≥ 1;\cr
\noalign{\vskip3pt}
A↓{mn}⊗= c + {1\over 2}A↓{(m-1)n} +\cdots
+ {1\over 2↑{m-1}} A↓{1n} + {1\over 2↑{m-1}} A↓{0n}⊗m > n ≥ 0;\cr
A↓{mn}⊗= A↓{nm}⊗n > m ≥ 0.\cr}}\eqno(34)$$
The problem is to solve for $A↓{mn}$ in terms
of $m$, $n$, and the parameters $a$, $b$, $c$, and $A↓{00}$. This is
an interesting set of recurrence equations, which have an interesting
solution.
First we observe that if $0 ≤ n < m$,
$$\eqalign{A↓{(m+1)n} ⊗= c + \sum ↓{1≤k≤m}2↑{-k}A↓{(m+1-k)n}
+ 2↑{-m}A↓{0n}\cr
⊗= c + {\textstyle{1\over 2}}A↓{mn} + \sum ↓{1≤k<m}2↑{-k-1}A↓{(m-k)n}
+ 2↑{-m}A↓{0n}\cr
⊗\textstyle= c + {1\over 2}A↓{mn} + {1\over 2}(A↓{mn} - c)\cr
\noalign{\vskip 4pt}
⊗\textstyle= {1\over 2}c+ A↓{mn}.\cr}$$
Hence $A↓{(m+k)n}={1\over2}ck+A↓{mn}$, by induction on $k$. In particular,
since $A↓{10}=c+A↓{00}$, we have
$$\textstyle A↓{m0}={1\over2}c(m+1)+A↓{00},\qquad m>0.\eqno(35)$$
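% Relations (34) and (35) can be checked mechanically. The following
% Python sketch (our own illustration, using exact rational arithmetic)
% computes the A_mn directly from the recurrences and verifies (35):
%
%   from fractions import Fraction as F
%
%   def solve_A(M, a, b, c, A00):
%       # dynamic programming over the double recurrence (34);
%       # the third line of (34) is used via the symmetric lookup
%       A = {(0, 0): F(A00)}
%       get = lambda i, j: A[(i, j)] if i >= j else A[(j, i)]
%       for m in range(1, M + 1):
%           for n in range(m):              # middle line of (34), m > n
%               s = F(c) + sum(F(1, 2**k) * get(m - k, n)
%                              for k in range(1, m))
%               A[(m, n)] = s + F(1, 2**(m - 1)) * get(0, n)
%           s = F(a) + F(b, 2**(m - 1))     # first line of (34)
%           s += sum(F(1, 2**(j - 1)) * A[(m, m - j)]
%                    for j in range(2, m + 1))
%           A[(m, m)] = s
%       return A
%
%   A = solve_A(8, a=0, b=0, c=1, A00=0)
%   assert all(A[(m, 0)] == F(m + 1, 2) for m in range(1, 9))   # eq. (35)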
%folio 432 galley 8 (C) Addison-Wesley 1978 *
Now let $A↓m = A↓{mm}$. If $m > 0$, we have
$$\eqalign{A↓{(m+1)m}⊗ = c + \sum ↓{1≤k≤m+1} 2↑{-k}A↓{(m+1-k)m}
+ 2↑{-m-1}A↓{0m}\cr
⊗= c + {1\over 2}A↓{mm} + \sum ↓{1≤k≤m}
\biglp 2↑{-k-1}(A↓{(m-k)(m+1)} - c/2)\bigrp + 2↑{-m-1}A↓{0m}\cr
⊗\textstyle= c + {1\over 2}A↓m + {1\over 2}(A↓{(m+1)(m+1)} -
a - 2↑{-m}b) - {1\over 4}c(1 - 2↑{-m})\cr
\noalign{\vskip 2pt}
⊗\textstyle\hskip 60mm\null+ 2↑{-m-1}\biglp{1\over2}c(m + 1) + A↓{00}\bigrp\cr
\noalign{\vskip 5pt}
⊗\textstyle= {1\over 2}(A↓m + A↓{m+1}) + {3\over 4}c
- {1\over 2}a + 2↑{-m-1}(c - b + A↓{00}) + m2↑{-m-2}c.\cr}\eqno(36)$$
Similar maneuvering, as shown in exercise 19,
establishes the relation
$$\textstyle A↓{n+1} = {3\over 4}A↓n + {1\over 4}A↓{n-1} + α + 2↑{-n-1}β
+ (n + 2)2↑{-n-1}\gamma ,\quad n ≥ 2,\eqno (37)$$
where
$$\textstyle α = {1\over 4}a + {7\over 8}c,\qquad β = A↓{00} - b - {3\over
2}c,\qquad\hbox{and}\qquad \gamma = {1\over 2}c.$$
Thus the double recurrence (34) can be transformed
into the single recurrence relation in (37). Use of the generating
function $G(z) = A↓0 + A↓1z + A↓2z↑2 +\cdots$ now
transforms (37) into the equation
$${\textstyle(1 - {3\over 4}z - {1\over 4}z↑2)}G(z) = a↓0 + a↓1z +
a↓2z↑2 + {α\over 1 - z} + {β\over 1 - z/2} + {\gamma \over (1
- z/2)↑2},\eqno(38)$$
where $a↓0$, $a↓1$, and $a↓2$ are constants that
can be determined by the values of $A↓0$, $A↓1$, and $A↓2$. Since
$1 - {3\over 4}z - {1\over 4}z↑2 = (1 + {1\over 4}z)(1 - z)$,
we can express $G(z)$ by the method of partial fractions in
the form
$$G(z) = b↓0 + b↓1z + {b↓2\over (1 - z)↑2} + {b↓3\over 1 -
z} + {b↓4\over (1 - z/2)↑2} + {b↓5\over 1 - z/2} + {b↓6\over
1 + z/4} .$$
Tedious manipulations produce the values of these
constants $b↓0$, $\ldotss$, $b↓6$, and therefore the coefficients
of $G(z)$ are determined. We finally obtain the solution
$$\baselineskip15pt
\eqalignno{A↓{nn} ⊗\textstyle= n({1\over 5}a + {7\over 10}c) + ({16\over
25}a + {2\over 5}b - {23\over 50}c + {3\over 5}A↓{00})\cr
⊗\textstyle\quad\null + 2↑{-n}(-{1\over 3}cn + {2\over 3}b - {1\over 9}c
- {2\over 3}A↓{00})\cr
⊗\textstyle\quad\null + (-{1\over 4})↑n(-{16\over 25}a - {16\over 15}b +
{16\over 225}c + {16\over 15}A↓{00}) + {1\over 2}c\delta ↓{n0};\cr
\noalign{\vskip 3pt}
A↓{mn} ⊗\textstyle= {1\over2}mc + n({1\over 5}a + {1\over 5}c)
+ ({6\over 25}a + {2\over 5}b + {7\over 50}c + {3\over 5}A↓{00})
+ 2↑{-n}({1\over 3}c)\cr
⊗\textstyle\quad\null + (-{1\over 4})↑n(-{6\over 25}a - {2\over 5}b + {2\over
75}c + {2\over 5}A↓{00}),\qquad m > n.⊗(39)\cr}$$
With these elaborate calculations done, we can
readily determine the behavior of the lattice-point model. Assume
that the inputs $u$ and $v$ to the algorithm are odd, and let
$m = \lfloor\lg u\rfloor$, $n = \lfloor\lg v\rfloor $.
The average number of subtraction cycles, namely the quantity
$C$ in the analysis of Program B\null,
is obtained by setting $a = 1$, $b = 0$, $c = 1$, $A↓{00}
= 1$ in the recurrence (34). By (39) we see that (for $m ≥ n$)
the lattice model predicts
$$\textstyle C = {1\over 2}m + {2\over 5}n + {49\over 50} - {1\over
5}\delta ↓{mn}\eqno (40)$$
subtraction cycles, plus terms that rapidly go
to zero as $n$ approaches infinity.
The probability that $u$ and $v$ are relatively prime is
obtained by setting $a = b = c = 0$, $A↓{00} = 1$ in (34); this
probability comes to approximately
${3\over 5}$. Actually, since $u$ and $v$ are assumed to be
odd, they should be relatively prime with probability $8/π↑2$
(see exercise 13), so this reflects the degree of inaccuracy
of the lattice-point model.
The average number of times that a path from $(m, n)$
goes through one of the ``diagonal'' points $(m↑\prime , m↑\prime
)$ for some $m↑\prime ≥ 1$ is obtained by setting $a = 1$, $b
= c = A↓{00} = 0$ in (34); so we find that this quantity is
approximately
$$\textstyle{1\over 5}n + {6\over 25} + {2\over 5}\delta ↓{mn},\qquad
\hbox{when }m ≥ n.$$
Now we can determine the average number of shifts,
the number of times step B3 is performed.\xskip (This is the quantity
$D$ in Program B.)\xskip In any execution of Algorithm B\null, with $u$
and $v$ both odd, the corresponding path in the lattice model
must satisfy the relation
$$\hbox{number of shifts} + \hbox{number of diagonal points} + 2\lfloor\lg
\gcd(u, v)\rfloor = m + n,$$
since we are assuming that $r$ in (33) is always
either 0 or 1. The average value of $\lfloor\lg\gcd(u, v)\rfloor$
predicted by the lattice-point model is approximately ${4\over
5}$ (see exercise 20). Therefore we have, for the total number
of shifts,
$$\eqalign{D⊗\textstyle= m + n - ({1\over 5}n + {6\over 25} + {2\over 5}\delta
↓{mn}) - 2\cdot{4\over 5}\cr
\noalign{\vskip4pt}
⊗=\textstyle m + {4\over 5}n - {46\over 25}- {2\over 5}\delta ↓{mn},\cr}\eqno (41)$$
when $m ≥ n$, plus terms that decrease rapidly to zero
for large $n$.
To summarize the most important facts we have derived
from the lattice-point model, we have shown that if $u$ and
$v$ are odd and if $\lfloor\lg u\rfloor = m$, $\lfloor\lg
v\rfloor = n$, then the quantities $C$ and $D$ that are the
critical factors in the running time of Program B will have
average values given by
$$\textstyle C = {1\over 2}m + {2\over 5}n + O(1),\qquad D = m + {4\over
5}n + O(1),\qquad m ≥ n.\eqno (42)$$
But the model that we have used to derive (42)
is only a pessimistic approximation to the true behavior; Table
1 compares the true average values of $C$, computed by actually
running Algorithm B with all possible inputs, to the values
predicted by the lattice-point model, for small $m$ and $n$.
The lattice model is completely accurate when $m$ or $n$ is
zero, but it tends to be
less accurate when $|m - n|$ is small and $\min(m, n)$ is
large. When $m = n = 9$, the lattice-point model gives $C =
8.78$, compared to the true value $C = 7.58$.
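% The ``actual average values'' of Table 1 can be recomputed by brute force.
% Here is a small Python sketch (our own test harness, not part of Program B)
% that runs the subtract-and-shift cycles of Algorithm B over all odd inputs
% with given m and n:
%
%   def C(u, v):
%       # subtraction cycles of Algorithm B applied to odd u and v
%       count = 0
%       while True:
%           count += 1
%           t = abs(u - v)            # step B6
%           if t == 0:
%               return count
%           while t % 2 == 0:         # steps B3-B4
%               t //= 2
%           if u > v: u = t           # step B5
%           else:     v = t
%
%   def avg_C(m, n):
%       us = [u for u in range(2**m, 2**(m + 1)) if u % 2 == 1]
%       vs = [v for v in range(2**n, 2**(n + 1)) if v % 2 == 1]
%       return sum(C(u, v) for u in us for v in vs) / (len(us) * len(vs))
%
%   print(avg_C(2, 1), avg_C(3, 3))   # 3.0 and 2.875, i.e., the
%                                     # 3.00 and 2.88 entries of Table 1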
\topinsert{\tablehead{Table 1}
\yskip
\ctrline{NUMBER OF SUBTRACTIONS IN ALGORITHM B}
\yyskip
\baselineskip 0pt
\def\\{\lower 2.5pt\vbox to 11pt{}}
\halign to size{#\hfill\quad⊗#\tabskip0pt plus100pt⊗#\tabskip0pt⊗#⊗\quad\hfill#\cr
\lower 3pt\null⊗\hfill$n$\hfill⊗⊗\hfill$n$\hfill\cr
⊗\leaders\hrule\hfill⊗⊗\leaders\hrule\hfill\cr
\lower 3pt\vbox to 15pt{}\!
⊗\hfill0\hfill\quad\hfill1\hfill\quad\hfill2\hfill\quad\hfill3\hfill\quad
\hfill4\hfill\quad\hfill5\hfill
⊗⊗\hfill0\hfill\quad\hfill1\hfill\quad\hfill2\hfill\quad\hfill3\hfill\quad
\hfill4\hfill\quad\hfill5\hfill\cr
\\0⊗1.00\quad2.00\quad2.50\quad3.00\quad3.50\quad4.00⊗\!
⊗1.00\quad2.00\quad2.50\quad3.00\quad3.50\quad4.00⊗0\cr
\\1⊗2.00\quad1.00\quad2.50\quad3.00\quad3.50\quad4.00⊗\!
⊗2.00\quad1.00\quad3.00\quad2.75\quad3.63\quad3.94⊗1\cr
\\2⊗2.50\quad2.50\quad2.25\quad3.38\quad3.88\quad4.38⊗\!
⊗2.50\quad3.00\quad2.00\quad3.50\quad3.88\quad4.25⊗2\cr
\\3⊗3.00\quad3.00\quad3.38\quad3.25\quad4.22\quad4.72⊗\!
⊗3.00\quad2.75\quad3.50\quad2.88\quad4.13\quad4.34⊗3\cr
\\4⊗3.50\quad3.50\quad3.88\quad4.22\quad4.25\quad5.10⊗\!
⊗3.50\quad3.63\quad3.88\quad4.13\quad3.94\quad4.80⊗4\cr
\\5⊗4.00\quad4.00\quad4.38\quad4.72\quad5.10\quad5.19⊗\!
⊗4.00\quad3.94\quad4.25\quad4.34\quad4.80\quad4.60⊗5\cr
$m$\lower 7pt\vbox to 15pt{}⊗\hfill Predicted by model\hfill
⊗⊗\hfill Actual average values\hfill⊗$m$\cr
⊗\leaders\hrule\hfill⊗⊗\leaders\hrule\hfill\cr}}
%folio 434 galley 9 (C) Addison-Wesley 1978 *
Empirical tests of Algorithm B with several
million random inputs and with various values of $m, n$ in the
range $29 ≤ m, n ≤ 37$ indicate that the actual average behavior
of the algorithm is given by
$$\baselineskip 15pt
\rpile{C\approx\null\cr D\approx\null\cr}
\rpile{\textstyle{1\over 2}m\cr m\cr}
\lpile{\null + 0.203n\cr \null+0.41n\cr}
\lpile{\null + 1.9 - 0.4(0.6)↑{m-n},\cr \null - 0.5 - 0.7(0.6)↑{m-n},\cr}
\qquad m ≥ n.\eqno (43)$$
These experiments showed a rather small standard
deviation from the observed average values. The coefficients
${1\over 2}$ and 1 of $m$ in (42) and (43) can be verified rigorously
without using the lattice-point approximation (see exercise
21); so the error in the lattice-point model is apparently in
the coefficient of $n$, which is too high.
The above calculations have been made under the
assumption that $u$ and $v$ are odd and in the ranges $2↑m ≤
u < 2↑{m+1}$, $2↑n ≤ v < 2↑{n+1}$. If we say instead that $u$
and $v$ are to be {\sl any} integers, independently and uniformly
distributed over the ranges
$$1 ≤ u < 2↑N,\qquad 1 ≤ v < 2↑N,$$
then we can calculate the average values of $C$
and $D$ from the data already given; in fact, if $C↓{mn}$ denotes
the average value of $C$ under our earlier assumptions, exercise
22 shows that we have
$$\twoline{\hskip-10pt(2↑N - 1)↑2C = N↑2C↓{00}+2N\hskip-2pt\sum↓{1≤n≤N}\hskip-2pt
(N-n)2↑{n-1}
C↓{n0}}{0pt}{\chop to 12pt{\null+
2\hskip-6pt\sum↓{1≤n<m≤N}\hskip-6pt(N-m)(N-n)2↑{m+n-2}C↓{mn}
+\hskip-4pt\sum↓{1≤n≤N}\hskip-2pt(N-n)↑22↑{2n-2}C↓{nn}.\9(44)\hskip-10pt}}$$
The same formula holds for $D$ in terms of $D↓{mn}$.
If the indicated sums are carried out using the approximations
in (43), we obtain
$$C \approx 0.70N + O(1),\qquad D \approx 1.41N + O(1).$$
(See exercise 23.) This agrees perfectly with the
results of further empirical tests, made on several million
random inputs for $N ≤ 30$; the latter tests show that we may
take
$$C = 0.70N - 0.5,\qquad D = 1.41N - 2.7\eqno (45)$$
as good estimates of the values, given this distribution
of the inputs $u$ and $v$.
Richard Brent has discovered a continuous model
that accounts for the leading terms in (45). Let us assume
that $u$ and $v$ are large, and that the number of shifts per
iteration has the value $d$ with probability exactly $2↑{-d}$.
If we let $X = u/v$, the effect of steps B3--B5 is to replace
$X$ by $(X - 1)/2↑d$ if $X > 1$, or by $2↑d/(X↑{-1} - 1)$ if $X <
1$. The random variable $X$ has a limiting distribution that
makes it possible to analyze the average value of the ratio
by which $\max(u, v)$ decreases at each iteration; see exercise
25. Numerical calculations show that this maximum decreases
at the average rate of $1/β \approx 1.4165$ bits per iteration, where
$β = 0.705971246102\ldotss\,$; the agreement with
experiment is so good that Brent's constant $β$ must be the
true value of the number ``0.70'' in (45), and we should replace
0.203 by 0.206 in (43).\xskip [See {\sl Algorithms and Complexity}, ed.\ by
J. F. Traub (New York: Academic Press, 1976), 321--355.]
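% Brent's rate can be observed empirically. The following Python sketch
% (an illustration of ours, not Brent's method) measures how many bits
% lg max(u,v) loses per subtract-and-shift cycle on random odd inputs:
%
%   import random
%   from math import log2
%
%   def rate(bits, trials=100):
%       tot_lg = tot_cycles = 0.0
%       for i in range(trials):
%           u = random.getrandbits(bits) | 1     # random odd inputs
%           v = random.getrandbits(bits) | 1
%           tot_lg += log2(max(u, v))
%           while u != v:
%               t = abs(u - v)
%               while t % 2 == 0:                # the d right shifts
%                   t //= 2
%               if u > v: u = t
%               else:     v = t
%               tot_cycles += 1
%           tot_lg -= log2(u)          # now u = v = odd part of gcd
%       return tot_lg / tot_cycles
%
%   print(rate(2000))   # about 1.4165 = 1/beta, so C is near 0.706 N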
This completes our analysis of the average values
of $C$ and $D$. The other three quantities appearing in the
running time of Algorithm B are rather easily analyzed; see
exercises 6, 7, and 8.
Thus we know approximately how Algorithm B behaves
on the average. Let us now consider a ``worst case'' analysis:
What values of $u$ and $v$ are in some sense the hardest to handle?
If we assume as before that
$$\lfloor\lg u\rfloor = m\qquad\hbox{and}\qquad \lfloor
\lg v\rfloor = n,$$
let us try to find $(u, v)$ that make the algorithm
run most slowly. In view of the fact that the subtractions take
somewhat longer than the shifts, when the auxiliary bookkeeping
is considered, this question may be rephrased by asking for the inputs
$u$ and $v$ that require most subtractions. The answer is somewhat
surprising; the maximum value of $C$ is exactly
$$\max(m, n) + 1,\eqno (46)$$
although the lattice-point model would predict
that substantially higher values of $C$ are possible (see exercise
26). The derivation of the worst case (46) is quite interesting,
so it has been left as an amusing problem for the reader to
work out by himself (see exercises 27, 28).
\exbegin{EXERCISES}
\exno 1. [M21] How can
(8), (9), (10), (11), and (12) be derived easily from (6) and
(7)?
\exno 2. [M22] Given that $u$ divides $v↓1v↓2 \ldotsm v↓n$, prove
that $u$ divides
$$\gcd(u, v↓1)\gcd(u, v↓2) \ldotsm\gcd(u, v↓n).$$
\exno 3. [M23] Show that the number
of ordered pairs of positive integers $(u, v)$ such that $\lcm(u,
v) = n$ is the number of divisors of $n↑2$.
\exno 4. [M21] Given positive integers
$u$ and $v$, show that there are divisors $u↑\prime$ of $u$
and $v↑\prime$ of $v$ such that $\gcd(u↑\prime , v↑\prime ) =
1$ and $u↑\prime v↑\prime = \lcm(u, v)$.
\trexno 5. [M26] Invent an algorithm
(analogous to Algorithm B) for calculating the greatest common
divisor of two integers based on their {\sl balanced ternary}
representation. Dem\-on\-strate your algorithm by applying it to
the calculation of $\gcd(40902,24140)$.
\exno 6. [M22] Given that $u$ and $v$ are random positive integers,
find the mean and standard deviation of the quantity $A$
that enters into the timing of Program B.\xskip (This is the
number of right shifts applied to both $u$ and $v$ during the preparatory
phase.)
\exno 7. [M20] Analyze the quantity $B$ that enters into the
timing of Program B.
\trexno 8. [M25] Show that in Program B\null, the average value of
$E$ is approximately equal to ${1\over 2}C↓{\hbox{\:e ave}}$, where
$C↓{\hbox{\:e ave}}$ is the average value of $C$.
\exno 9. [18] Using Algorithm B
and hand calculation, find $\gcd(31408,2718)$. Also find integers
$m$ and $n$ such that $31408m + 2718n = \gcd(31408,2718)$, using
Algorithm X.
\trexno 10. [HM24] Let $q↓n$ be the number of ordered pairs of
integers $(u, v)$ such that $1 ≤ u, v ≤ n$ and $\gcd(u, v) =
1$. The object of this exercise is to prove that $\lim↓{n→∞}
q↓n/n↑2 = 6/π↑2$, thereby establishing Theorem D\null.
\yskip\hang\textindent{a)}Use the principle of inclusion and exclusion
(Section 1.3.3) to show that
$$q↓n = n↑2 - \sum ↓{p↓1}\lfloor n/p↓1\rfloor ↑2 + \sum ↓{p↓1<p↓2}
\lfloor n/p↓1p↓2\rfloor ↑2 - \cdotss,$$
where the sums are taken over all {\sl prime} numbers
$p↓i$.
\hang\textindent{b)}The {\sl M\"obius function} $\mu (n)$ is defined
by the rules $\mu (1) = 1$, $\mu (p↓1p↓2 \ldotsm p↓r) = (-1)↑r$
if $p↓1$, $p↓2$, $\ldotss$, $p↓r$ are distinct primes, and $\mu (n)
= 0$ if $n$ is divisible by the square of a prime. Show that
$q↓n = \sum ↓{k≥1} \mu (k)\lfloor n/k\rfloor ↑2$.
\hang\textindent{c)}As a consequence of (b), prove that $\lim↓{n→∞}
q↓n/n↑2 = \sum ↓{k≥1}\mu (k)/k↑2$.
\hang\textindent{d)}Prove that $\biglp\sum ↓{k≥1}\mu (k)/k↑2\bigrp\biglp\sum ↓{m≥1}
1/m↑2\bigrp= 1$.\xskip {\sl Hint:} When the series are absolutely convergent
we have
$$\bigglp\sum ↓{k≥1} a↓k/k↑2\biggrp\bigglp\sum ↓{m≥1} b↓m/m↑2\biggrp = \sum
↓{n≥1}\bigglp\sum ↓{d\rslash n} a↓db↓{n/d}\left.\biggrp\right/n↑2.$$
\exno 11. [M22] What is the probability
that $\gcd(u, v) ≤ 3$?\xskip (See Theorem C\null.)\xskip What is the {\sl average}
value of $\gcd(u, v)$?
\exno 12. [M24] (E. Ces\`aro.)\xskip If $u$ and $v$ are random
positive integers, what is the average number of (positive)
divisors they have in common?\xskip [{\sl Hint:} See the identity
in exercise 10(d), with $a↓k = b↓m = 1$.]
%folio 439 galley 10 WARNING: Some bad spots. (C) Addison-Wesley 1978 *
\exno 13. [HM23] Given that $u$
and $v$ are random {\sl odd\/} positive integers, show that they
are relatively prime with probability $8/π↑2$.
\exno 14. [M26] What are the values of $v↓1$ and $v↓2$ when
Algorithm X terminates?
\trexno 15. [M22] Design an algorithm to {\sl divide $u$ by $v$ modulo
$m$}, given positive integers $u$, $v$, and $m$, with $v$ relatively
prime to $m$. In other words, your algorithm should find $w$,
in the range $0 ≤ w < m$, such that $u ≡ vw\modulo m$.
\exno 16. [21] Use the text's method to find a general solution
in integers to the following sets of equations:
$$\rpile{\hbox{a)}\quad 3x + 7y + 11z = 1\cr
5x+7y-\95z=3\cr}\hskip 80pt
\rpile{\hbox{b)}\quad 3x + 7y + 11z =\quad 1\cr
5x + 7y -\95z = -3\cr}$$
\trexno 17. [M24] Show how Algorithm L
can be extended (as Algorithm A was extended to Algorithm X)
to obtain solutions of (15) when $u$ and $v$ are large.
\exno 18. [M37] Let $u$ and $v$ be odd integers, independently
and uniformly distributed in the ranges $2↑m ≤ u < 2↑{m+1}$,
$2↑n ≤ v < 2↑{n+1}$. What is the {\sl exact} probability that
a single ``subtract and shift'' cycle in Algorithm B\null, namely
an operation that starts at step B6 and then stops after step
B5 is finished, reduces $u$ and $v$ to the ranges $2↑{m↑\prime} ≤
u < 2↑{m↑\prime+1}$, $2↑{n↑\prime} ≤ v < 2↑{n↑\prime+1}$, as a function of $m$, $n$,
$m↑\prime$, and $n↑\prime$?\xskip (This exercise gives more accurate
values for the transition probabilities than the text's model does.)
\exno 19. [M24] Complete the text's derivation of (38), by establishing
(37).
\exno 20. [M26] Let $λ = \lfloor\lg\gcd(u, v)\rfloor$. Show
that the lattice-point model gives $λ = 1$ with probability
${1\over 5}$, $λ = 2$ with probability ${1\over 10}$, $λ = 3$
with probability ${1\over 20}$, etc., plus correction terms
that go rapidly to zero as $u$ and $v$ approach infinity; hence
the average value of $λ$ is approximately ${4\over 5}$.\xskip
[{\sl Hint:} Consider the relation between the probability of
a path from $(m, n)$ to $(k + 1, k + 1)$ and a corresponding
path from $(m - k, n - k)$ to $(1, 1)$.]
\exno 21. [HM26] Let $C↓{mn}$ and $D↓{mn}$ be
the average number of subtraction and shift cycles, respectively,
in Algorithm B\null, when $u$ and $v$ are odd, $\lfloor\lg u\rfloor
= m$, $\lfloor\lg v\rfloor = n$. Show that for fixed $n$, $C↓{mn}
= {1\over 2}m + O(1)$ and $D↓{mn} = m + O(1)$ as $m → ∞$.
\exno 22. [23] Prove Eq.\ (44).
\exno 23. [M28] Show that if $C↓{mn} = αm + βn + \gamma$ for
some constants $α$, $β$, and $\gamma $, then
$$\eqalign{\sum ↓{1≤n≤m≤N}(N - m)(N - n)2↑{m+n-2}C↓{mn}⊗ = 2↑{2N}
\textstyle\biglp {11\over 27}(α + β)N + O(1)\bigrp ,\cr
\chop to 11pt{\sum ↓{1≤n≤N}(N - n)↑22↑{2n-2}C↓{nn}}⊗= 2↑{2N}\textstyle\biglp {5\over
27}(α + β)N + O(1)\bigrp.\cr}$$
\trexno 24. [M30] If $v = 1$ but $u$ is large
during Algorithm B\null, it may take fairly long for
the algorithm to determine that $\gcd(u, v) = 1$. Perhaps it
would be worth while to add a test at the beginning of step
B5: ``If $t = 1$, the algorithm terminates with $2↑k$ as the
answer.'' Explore the question of whether or not this would be an
improvement when the algorithm deals with random inputs, by determining
the average number of times that step B6 is executed with $u=1$ or $v=1$,
using the lattice-point model.
\trexno 25. [M26] (R. P. Brent.)\xskip Let $u↓n$ and $v↓n$ be the values of
$u$ and $v$ after $n$ iterations of steps B3--B5; let $X↓n =
u↓n/v↓n$, and assume that $F↓n(x)$ is the probability that $X↓n
≤ x$, for $0 ≤ x < ∞$.\xskip (a) Express $F↓{n+1}(x)$ in terms of
$F↓n(x)$, under the assumption that step B4 always branches
to B3 with independent probability ${1\over 2}$.\xskip (b) Let $G↓n(x) = F↓n(x)
+ 1 - F↓n(x↑{-1})$ be the probability that $Y↓n ≤ x$, for $0
≤ x ≤ 1$, where $Y↓n =\min(u↓n, v↓n)/\!\max(u↓n, v↓n)$. Express
$G↓{n+1}$ in terms of $G↓n$.\xskip (c) Express $H↓n(x) =\hbox{probability that }
\biglp \max(u↓{n+1},v↓{n+1})/\!\max(u↓n, v↓n) < x\bigrp$ in terms of $G↓n$.
\exno 26. [M23] What is the length of the longest path from
$(m, n)$ to $(0, 0)$ in the lattice-point model?
\trexno 27. [M28] Given $m≥n≥1$, find values of $u$, $v$ with $\lfloor\lg u\rfloor
= m$, $\lfloor\lg v\rfloor = n$ such that Algorithm B requires
$m + 1$ subtraction steps.
\exno 28. [M37] Prove that the subtraction step B6 of Algorithm B is never executed
more than $1+\lfloor\lg\max(u,v)\rfloor$ times.
\exno 29. [M30] Evaluate the determinant
$$\left|\,\vcenter{\halign{$\ctr{#}$\quad⊗$\ctr{#}$\quad⊗$\ctr{#}$\quad⊗$\ctr{#}$\cr
\gcd(1,1)⊗\gcd(1,2)⊗\ldots⊗\gcd(1,n)\cr
\gcd(2,1)⊗\gcd(2,2)⊗\ldots⊗\gcd(2,n)\cr
\noalign{\vskip-2pt}
\vdots⊗\vdots⊗⊗\vdots\cr
\gcd(n,1)⊗\gcd(n,2)⊗\ldots⊗\gcd(n,n)\cr}}\,\right|.$$
\exno 30. [M25] Show that Euclid's algorithm (Algorithm A)
applied to two $n$-bit binary numbers
requires $O(n↑2)$ units of time, as $n→∞$.\xskip(The same upper bound obviously holds
for Algorithm B.)
\exno 31. [M22] Use Euclid's algorithm to find a simple formula for $\gcd(2↑m-1,
2↑n-1)$ when $m$ and $n$ are nonnegative integers.
\exno 32. [M43] Can the upper bound $O(n↑2)$ in exercise 30 be decreased, if
another algorithm for calculating the greatest common divisor is used?
\exno 33. [M46] Analyze V. C. Harris's ``binary Euclidean algorithm.''
\trexno 34. [M32] (R. W. Gosper.)\xskip Demonstrate how
to modify Algorithm B for large numbers, using ideas analogous
to those in Algorithm L.
\trexno 35. [M28] (V. R. Pratt.)\xskip Extend Algorithm B to an Algorithm Y that is
analogous to Algorithm X.
\exno 36. [HM49] Find a rigorous proof that Brent's model describes the asymptotic
behavior of Algorithm B.
%folio 441 galley 11 (C) Addison-Wesley 1978 *
\def\bslash{\char'477 } % boldface slash (vol. 2 only)
\runningrighthead{ANALYSIS OF EUCLID{\:a'}S ALGORITHM}
\section{4.5.3}
\sectionskip
\sectionbegin{\star4.5.3. Analysis of Euclid's Algorithm}
The execution time of Euclid's algorithm depends
on $T$, the number of times the division step A2 is performed.\xskip
(See Algorithm 4.5.2A and Program 4.5.2A.)\xskip The quantity
$T$ is also an important factor in the running time of other algorithms,
such as the evaluation of functions satisfying a reciprocity
formula (see Section 3.3.3). We shall see in this section that
the mathematical analysis of this quantity $T$ is interesting
and instructive.
\subsectionbegin{Relation to continued fractions}
Euclid's algorithm is intimately connected with {\sl continued
fractions}, which are expressions of the form
$$\eqalignno{{b↓1\hfill\over\dispstyle
a↓1+{b↓2\hfill\over\dispstyle a↓2+{b↓3\hfill\over\dispstyle\cdotss{
\vbox to 6.944pt{}
\over \dispstyle a↓{n-1}+{b↓n\over a↓n}}}}}=b↓1/\biglp a↓1+b↓2/(a↓2+b↓3/(\cdotss/
(a↓{n-1}+b↓n/a↓n)\ldotsm))\bigrp.⊗⊗\lower 30pt\hbox{(1)}\cr}$$
Continued fractions have a beautiful theory that
is the subject of several books. [See, for example, O. Perron, {\sl Die Lehre
von den Kettenbr\"uchen}, 3rd ed.\ (Stuttgart: Teubner,
1954), 2 vols.; A. Khinchin, {\sl Continued Fractions}, tr.\
by Peter Wynn (Groningen: P. Noordhoff, 1963); H. S. Wall, {\sl
Analytic Theory of Continued Fractions} (New York: Van Nostrand,
1948); and see also J. Tropfke, {\sl Geschichte der Elementar-Mathematik
\bf 6} (Berlin: Gruyter, 1924), 74--84, for the early history
of the subject.] It is necessary to limit ourselves to a comparatively
brief treatment of the theory here, studying only those aspects
that give us more insight into the behavior of Euclid's algorithm.
The continued fractions of primary interest
to us are those in which all the $b$'s in (1) are equal to unity.
For convenience in notation, let us define
$$\bslash x↓1, x↓2, \ldotss , x↓n\bslash = 1/\biglp x↓1 +
1/(x↓2 + 1/(\cdots + 1/(x↓n) \ldotsm ))\bigrp .\eqno
(2)$$
Thus, for example,
$$\bslash x↓1\bslash = {1\over x↓1} ,\qquad \bslash x↓1,
x↓2\bslash = {1\over x↓1 + 1/(x↓2)} = {x↓2\over x↓1x↓2 + 1}
.\eqno (3)$$
If $n = 0$, the symbol $\bslash x↓1, \ldotss ,
x↓n\bslash$ is taken to mean 0.
Let us also define the polynomials $Q↓n(x↓1, x↓2,
\ldotss , x↓n)$ of $n$ variables, for $n ≥ 0$, by the rule
$$Q↓n(x↓1,x↓2,\ldotss,x↓n)=\left\{\vcenter{\baselineskip16pt
\halign{$#,\hfill$\quad⊗if $#\hfill$\cr
1⊗n = 0;\cr
x↓1⊗n=1;\cr
x↓1Q↓{n-1}(x↓2, \ldotss , x↓n) + Q↓{n-2}(x↓3,
\ldotss , x↓n)⊗n > 1.\cr}}\right.\eqno(4)$$
Thus $Q↓2(x↓1, x↓2) = x↓1x↓2 + 1$, $Q↓3(x↓1, x↓2,
x↓3) = x↓1x↓2x↓3 + x↓1 + x↓3$, etc. In general, as noted by
L. Euler in the eighteenth century, $Q↓n(x↓1, x↓2, \ldotss ,
x↓n)$ is the sum of all terms obtainable by starting with $x↓1x↓2
\ldotsm x↓n$ and deleting zero or more non\-overlapping pairs of
consecutive variables $x↓jx↓{j+1}$; there are $F↓{n+1}$ such
terms. The polynomials defined in (4) are called ``continuants.''
The basic property of the $Q$-polynomials is that
$$\bslash x↓1, x↓2, \ldotss , x↓n\bslash = Q↓{n-1}(x↓2, \ldotss
, x↓n)/Q↓n(x↓1, x↓2, \ldotss , x↓n),\qquad n ≥ 1.\eqno (5)$$
This can be proved by induction, since it implies
that
$$x↓0 + \bslash x↓1, \ldotss , x↓n\bslash = Q↓{n+1}(x↓0, x↓1,
\ldotss , x↓n)/Q↓n(x↓1, \ldotss , x↓n);$$
hence $\bslash x↓0, x↓1, \ldotss , x↓n\bslash$
is the reciprocal of the latter quantity.
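% Definitions (2) and (4), and the basic identity (5), are easy to try
% out in a few lines of Python (an informal sketch; the function names
% are our own):
%
%   from fractions import Fraction
%
%   def Q(xs):
%       # continuant polynomial Q_n evaluated at the list xs, per (4)
%       if len(xs) == 0: return 1
%       if len(xs) == 1: return xs[0]
%       return xs[0] * Q(xs[1:]) + Q(xs[2:])
%
%   def cf(xs):
%       # // x1, x2, ..., xn //, evaluated from the inside out as in (2)
%       value = Fraction(0)
%       for x in reversed(xs):
%           value = 1 / (x + value)
%       return value
%
%   xs = [3, 1, 1, 1, 2]
%   assert cf(xs) == Fraction(Q(xs[1:]), Q(xs))   # identity (5)
%   print(cf(xs))                                 # 8/29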
The $Q$-polynomials are symmetrical in the sense
that
$$Q↓n(x↓1, x↓2, \ldotss , x↓n) = Q↓n(x↓n, \ldotss , x↓2, x↓1).\eqno(6)$$
This follows from Euler's observation above, and
as a consequence we have
$$Q↓n(x↓1, \ldotss , x↓n) = x↓nQ↓{n-1}(x↓1, \ldotss , x↓{n-1})
+ Q↓{n-2}(x↓1, \ldotss , x↓{n-2})\eqno (7)$$
for $n>1$. The $Q$-polynomials also satisfy the important
identity
$$\twoline{Q↓n(x↓1, \ldotss , x↓n)Q↓n(x↓2, \ldotss , x↓{n+1}) -
Q↓{n+1}(x↓1, \ldotss , x↓{n+1})Q↓{n-1}(x↓2, \ldotss , x↓n)}{4pt}{=
(-1)↑n,\qquad n ≥ 1.\qquad (8)\hskip-10pt}$$
(See exercise 4.) The latter equation in connection
with (5) implies that
$$\twoline{\bslash x↓1, \ldotss , x↓n\bslash = {1\over q↓0q↓1}
- {1\over q↓1q↓2} + {1\over q↓2q↓3} - \cdots + {(-1)↑{n-1}\over
q↓{n-1}q↓n},}{4pt}{\hbox{where }q↓k = Q↓k(x↓1, \ldotss , x↓k).\qquad (9)
\hskip-10pt}$$
Thus the $Q$-polynomials are intimately related
to continued fractions.
Every real number $X$ in the range $0 ≤ X < 1$
has a {\sl regular continued fraction} defined as follows: Let
$X↓0 = X$, and for all $n ≥ 0$ such that $X↓n ≠ 0$ let
$$A↓{n+1} = \lfloor 1/X↓n\rfloor ,\qquad X↓{n+1} = 1/X↓n -
A↓{n+1}.\eqno (10)$$
If $X↓n = 0$, the quantities $A↓{n+1}$ and $X↓{n+1}$
are not defined, and the regular continued fraction for $X$
is $\bslash A↓1, \ldotss , A↓n\bslash $. If $X↓n ≠ 0$, this definition
guarantees that $0 ≤ X↓{n+1} < 1$, so each of the $A$'s is a
positive integer. The definition (10) clearly implies that
$$X = X↓0 = {1\over A↓1 + X↓1} = {1\over A↓1 +
1/(A↓2 + X↓2)} = \cdotss;$$
hence
$$X = \bslash A↓1, \ldotss , A↓{n-1}, A↓n + X↓n\bslash \eqno(11)$$
for all $n ≥ 1$, whenever $X↓n$ is defined. In particular, we have
$X = \bslash A↓1, \ldotss , A↓n\bslash $ when $X↓n=0$. If $X↓n
≠ 0$, the number $X$ lies {\sl between} the two values
$\bslash A↓1, \ldotss , A↓n\bslash$
and $\bslash A↓1, \ldotss , A↓n + 1\bslash $, since by (7)
the quantity $q↓n = Q↓n(A↓1, \ldotss , A↓n + X↓n)$ increases
monotonically from $Q↓n(A↓1, \ldotss , A↓n)$ up to $Q↓n(A↓1,
\ldotss , A↓n + 1)$ as $X↓n$ increases from 0 to 1, and by (9)
the continued fraction increases or decreases when $q↓n$ increases,
according as $n$ is even or odd. In fact,
$$\baselineskip14pt\eqalignno{|X - \bslash A↓1, \ldotss , A↓n\bslash | ⊗ =
|\bslash A↓1, \ldotss , A↓n + X↓n\bslash - \bslash A↓1, \ldotss
, A↓n\bslash |\cr
⊗= |\bslash A↓1, \ldotss , A↓n, 1/X↓n\bslash
- \bslash A↓1, \ldotss , A↓n\bslash |\cr
\noalign{\vskip 3pt}
⊗=\left|{Q↓n(A↓2, \ldotss , A↓n, 1/X↓n)\over Q↓{n+1}(A↓1,
\ldotss , A↓n, 1/X↓n)} - {Q↓{n-1}(A↓2, \ldotss , A↓n)\over Q↓n(A↓1,
\ldotss , A↓n)}\right| \cr
\noalign{\vskip 3pt}
⊗= 1/Q↓n(A↓1, \ldotss , A↓n)Q↓{n+1}(A↓1, \ldotss , A↓n,
1/X↓n)\cr
⊗≤ 1/Q↓n(A↓1, \ldotss , A↓n)Q↓{n+1}(A↓1, \ldotss , A↓n,
A↓{n+1})⊗(12)\cr}$$
by (5), (8), and (10). Therefore $\bslash A↓1, \ldotss
, A↓n\bslash$ is an extremely close approximation to $X$. If
$X$ is irrational, it is impossible to have $X↓n = 0$ for any
$n$, so the regular continued fraction expansion in this case
is an {\sl infinite continued fraction} $\bslash A↓1, A↓2, A↓3, \ldotss
\bslash $. The value of an infinite continued fraction is defined
to be $\lim↓{n→∞}\bslash A↓1, A↓2, \ldotss , A↓n\bslash $, and
from the inequality (12) it is clear that this limit equals
$X$.
The regular continued fraction expansion of real
numbers has several prop\-er\-ties analogous to the representation
of numbers in the decimal system. If we use the formulas above
to compute the regular continued fraction expansions of some
familiar real numbers, we find, for example, that
\ninepoint$$\baselineskip12pt\eqalignno{
\textstyle{8\over 29} ⊗= \bslash 3, 1, 1, 1, 2\bslash;\cr
\textstyle\quad\sqrt{8\over 29} ⊗= \bslash 1, 1, 9, 2,
2, 3, 2, 2, 9, 1, 2, 1, 9, 2, 2, 3, 2, 2, 9, 1, 2, 1, 9, 2,
2, 3, 2, 2, 9, 1, \ldotss\bslash ;\cr
\spose{\raise4.5pt\hbox{\hskip2.25pt$\scriptscriptstyle3$}}\sqrt2
⊗= 1 + \bslash 3, 1,
5, 1, 1, 4, 1, 1, 8, 1, 14, 1, 10, 2, 1, 4, 12, 2, 3, 2, 1,
3, 4, 1, 1, 2, 14, 3, \ldotss\bslash ;\cr
π ⊗= 3 + \bslash 7, 15, 1, 292, 1, 1, 1, 2, 1,
3, 1, 14, 2, 1, 1, 2, 2, 2, 2, 1, 84, 2, 2, 1, 1, 15, 3,
\ldotss\bslash ;\cr
e ⊗= 2 + \bslash 1, 2, 1, 1, 4, 1, 1, 6, 1, 1,
8, 1, 1, 10, 1, 1, 12, 1, 1, 14, 1, 1, 16, 1, 1, 18, 1, \ldotss
\bslash ;\cr
\gamma ⊗= \bslash 1, 1, 2, 1, 2, 1, 4, 3, 13, 5, 1,
1, 8, 1, 2, 4, 1, 1, 40, 1, 11, 3, 7, 1, 7, 1, 1, 5, \ldotss\bslash;\cr
\phi ⊗= 1 + \bslash 1, 1, 1, 1, 1, 1, 1, 1, 1,\ldotss\bslash .⊗\hbox{\:a(13)}\cr}$$
\tenpoint The numbers $A↓1$, $A↓2$, $\ldots$ are called the
{\sl partial quotients} of $X$. Note the regular pattern that
appears in the partial quotients for $\sqrt{8/29}$, $\phi $, and
$e$; the reasons for this behavior are discussed in exercises
12 and 16. There is no apparent pattern in the partial quotients
for $\spose{\raise5pt\hbox{\hskip2.5pt$\scriptscriptstyle3$}}\sqrt{2}$,
$π$, or $\gamma $.
It is interesting to note that the ancient Greeks'
first definition of real numbers, once they had discovered the
existence of irrationals, was essentially stated in terms of infinite
continued fractions.\xskip (Later they adopted the suggestion of Eudoxus
that $x = y$ should be defined instead as ``$x < r$ if and only
if $y < r$, for all rational $r$.'') See O. Becker, {\sl Quellen
und Studien zur Geschichte Math., Astron., Physik} (B) {\bf 2}
(1933), 311--333.
%folio 444 galley 12 (C) Addison-Wesley 1978 *
\def\bslash{\char'477 } % boldface slash (vol. 2 only)
\yskip When $X$ is a rational number, the regular
continued fraction corresponds in a natural way to Euclid's
algorithm. Let us assume that $X = v/u$, where $u > v ≥ 0$.
The regular continued fraction process starts with $X↓0 = X$;
let us define $U↓0 = u$, $V↓0 = v$. Assuming that $X↓n = V↓n/U↓n
≠ 0$, (10) becomes
$$\baselineskip15pt\cpile{A↓{n+1} = \lfloor U↓n/V↓n\rfloor,\cr
X↓{n+1} = U↓n/V↓n - A↓{n+1} = (U↓n\mod V↓n)/V↓n.\cr}\eqno(14)$$
Therefore, if we define
$$U↓{n+1} = V↓n,\qquad V↓{n+1} = U↓n\mod V↓n,\eqno (15)$$
the condition $X↓n = V↓n/U↓n$ holds throughout
the process. Furthermore, (15) is precisely the transformation
made on the variables $u$ and $v$ in Euclid's algorithm (see Algorithm
4.5.2A\null, step A2). For example, since ${8\over 29} = \bslash
3, 1, 1, 1, 2\bslash$, we know that Euclid's algorithm applied
to $u = 29$ and $v = 8$ will require exactly five division steps,
and the quotients $\lfloor u/v\rfloor$ in step A2 will be successively
3, 1, 1, 1, and 2. Note that the last partial quotient $A↓n$
must be 2 or more when $X↓n = 0$, $n ≥ 1$, since $X↓{n-1}$ is
less than unity.
From this correspondence with Euclid's algorithm
we can see that the regular continued fraction for $X$ terminates
at some step with $X↓n = 0$ if and only if $X$ is rational;
for it is obvious that $X↓n$ cannot be zero if $X$ is irrational,
and, conversely, we know that Euclid's algorithm always terminates.
If the partial quotients obtained during Euclid's algorithm
are $A↓1$, $A↓2$, $\ldotss$, $A↓n$, then we have, by (5),
$${v\over u} = {Q↓{n-1}(A↓2, \ldotss , A↓n)\over Q↓n(A↓1, A↓2,
\ldotss , A↓n)} .\eqno (16)$$
This formula holds also if Euclid's algorithm is
applied for $u < v$, when $A↓1 = 0$. Furthermore, because of
(8), $Q↓{n-1}(A↓2, \ldotss , A↓n)$ and $Q↓n(A↓1, A↓2, \ldotss
, A↓n)$ are relatively prime, and the fraction on the right-hand
side of (16) is in lowest terms; therefore
$$u = Q↓n(A↓1, A↓2, \ldotss , A↓n)d,\qquad v = Q↓{n-1}(A↓2,
\ldotss , A↓n)d,\eqno (17)$$
where $d = \gcd(u, v)$.
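% The correspondence (14)-(17) can be demonstrated concretely. This
% Python sketch (our own) extracts the quotients of Euclid's algorithm
% and reconstructs u and v from continuants, as in (17):
%
%   def Q(xs):
%       # continuant polynomial, definition (4)
%       if len(xs) == 0: return 1
%       if len(xs) == 1: return xs[0]
%       return xs[0] * Q(xs[1:]) + Q(xs[2:])
%
%   def quotients(u, v):
%       # the quotients floor(u/v) of Euclid's algorithm, cf. (15)
%       A = []
%       while v:
%           A.append(u // v)
%           u, v = v, u % v
%       return A, u               # final u is gcd(u, v)
%
%   A, d = quotients(29, 8)
%   print(A, d)                   # [3, 1, 1, 1, 2] and 1
%   assert 29 == Q(A) * d and 8 == Q(A[1:]) * d   # equations (17)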
\subsectionbegin{The worst case} We can now apply
these observations to determine the behavior of Euclid's algorithm
in the ``worst case,'' or in other words to give an upper bound
on the number of division steps. The worst case occurs when
the inputs are consecutive Fibonacci numbers:
\algbegin Theorem F ({\rm G. Lam\'e, 1845}). {\sl For $n ≥ 1$, let
$u$ and $v$ be integers with $u > v > 0$ such that Euclid's algorithm
applied to $u$ and $v$ requires exactly $n$ division steps, and such
that $u$ is as small as possible satisfying these conditions.
Then $u = F↓{n+2}$ and $v = F↓{n+1}$.}
\proofbegin By (17), we must have $u = Q↓n(A↓1,
A↓2, \ldotss , A↓n)d$ where $A↓1$, $A↓2$, $\ldotss$, $A↓n$, $d$ are positive
integers and $A↓n ≥ 2$. Since $Q↓n$ is a polynomial with nonnegative
coefficients, involving all of the variables, the minimum value
is achieved only when $A↓1=1$, $\ldotss$, $A↓{n-1}=1$, $A↓n=2$, $d=1$. Putting
these values in (17) yields the desired result.\quad\blackslug
\yyskip (This theorem has the historical claim
of being the first practical application of the Fibonacci sequence;
since then many other applications of Fibonacci numbers
to algorithms and to the study of algorithms have been discovered.)
As a consequence of Theorem F we have an important
corollary:
\thbegin Corollary. {\sl If\/\ $0 ≤ u, v < N$, the number of division
steps required when Algorithm 4.5.2A is applied to $u$ and
$v$ is at most $\,\lceil\log↓\phi(\sqrt{5}\,N)\rceil - 2$.}
\proofbegin By Theorem F\null, the maximum number
of steps, $n$, occurs when $u = F↓n$ and $v = F↓{n+1}$, where
$n$ is as large as possible with $F↓{n+1} < N$.\xskip (The first division
step in this case merely interchanges $u$ and $v$ when $n >
1.$)\xskip Since $F↓{n+1} < N$, we have $\phi ↑{n+1}/\sqrt{5} < N$
(see Eq.\ 1.2.8--15), so $n + 1 <\log↓\phi(\sqrt{5}\,N)$.\quad\blackslug
\yyskip\noindent Note that $\log↓\phi (\sqrt{5}\,N)$ is approximately
$2.078\ln N + 1.672 \approx 4.785\log↓{10} N + 1.672$. See
exercises 31 and 36 for extensions of Theorem F.
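% Theorem F and its corollary invite a quick computational check; the
% following Python sketch (our own test harness) verifies both on
% small cases:
%
%   from math import ceil, log, sqrt
%
%   def T(u, v):
%       # number of division steps of Euclid's algorithm, as in (18)
%       steps = 0
%       while v:
%           u, v = v, u % v
%           steps += 1
%       return steps
%
%   F = [0, 1]
%   while len(F) < 20:
%       F.append(F[-1] + F[-2])
%   for n in range(1, 15):        # Theorem F: exactly n division steps
%       assert T(F[n + 2], F[n + 1]) == n
%
%   phi = (1 + sqrt(5)) / 2
%   N = 500
%   worst = max(T(u, v) for u in range(1, N) for v in range(1, N))
%   print(worst, ceil(log(sqrt(5) * N, phi)) - 2)   # both come to 13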
\subsectionbegin{An approximate model} Now that we
know the maximum number of division steps that can occur, let
us attempt to find the {\sl average} number. Let $T(m, n)$ be
the number of division steps that occur when $u = m$ and $v = n$
are input to Euclid's algorithm. Thus
$$T(m, 0) = 0;\qquad T(m, n) = 1 + T(n, m \mod n)\qquad\hbox{if }
n ≥ 1.\eqno (18)$$
Let $T↓n$ be the average number of division steps
when $v = n$ and when $u$ is chosen at random; since only the
value of $u\mod v$ affects the algorithm after the first division
step, we may write
$$\chop to 12pt{T↓n = {1\over n} \sum ↓{0≤k<n} T(k, n).}\eqno (19)$$
For example, $T(0, 5) = 1$, $T(1, 5) = 2$, $T(2, 5)
= 3$, $T(3, 5) = 4$, $T(4, 5) = 3$, so
$$\textstyle T↓5 = {1\over 5}(1 + 2 + 3 + 4 + 3) = 2{3\over 5}.$$
In order to estimate $T↓n$ for large $n$,
let us first try an approximation suggested by R. W. Floyd:
We might assume that, for $0 ≤ k < n$, the value of $n$ is essentially ``random''
modulo $k$, so that we can set
$$T↓n \approx 1 + {1\over n}\,(T↓0 + T↓1 + \cdots + T↓{n-1}).$$
Then $T↓n \approx S↓n$, where the sequence $\langle
S↓n\rangle$ is the solution to the recurrence relation
$$S↓0 = 0,\qquad S↓n = 1 + {1\over n}\,(S↓0 + S↓1 +\cdots
+ S↓{n-1}),\qquad n ≥ 1.\eqno (20)$$
(This approximation is analogous to the ``lattice-point
model'' used to investigate Algorithm B in Section 4.5.2.)
The recurrence (20) is readily solved by the use
of generating functions. A more direct way to solve it, analogous
to our solution of the lattice-point model, is by noting that
$$\baselineskip26pt
\eqalign{S↓{n+1}\, ⊗= \,1 + {1\over n + 1}\,(S↓0 + S↓1 +\cdots
+ S↓{n-1} + S↓n)\cr
⊗=\, 1 + {1\over n + 1}\, \biglp n(S↓n - 1) + S↓n\bigrp\,=\, S↓n + {1\over n + 1} ;\cr}$$
hence $S↓n$ is $1 + {1\over 2} +\cdots+
{1\over n} = H↓n$, a harmonic number. The approximation
$T↓n \approx S↓n$ now suggests that $T↓n \approx \ln n + O(1)$.
Comparison of this approximation with tables of
the true value of $T↓n$ shows, however, that $\ln n$ is too large;
$T↓n$ does not grow this fast. One way to account for the fact
that this approximation is too pessimistic is to observe that
the average value of $n \mod k$ is less than the average value
of ${1\over 2}k$, in the range $1 ≤ k ≤ n$:
$$\eqalignno{{1\over n}\sum ↓{1≤k≤n} (n\mod k)⊗ = {1\over n}\hskip-6pt \sum
↓{\scriptstyle 1≤q≤n\atop\scriptstyle\lfloor n/(q+1)\rfloor<k≤\lfloor n/q\rfloor}
\hskip-8pt (n- qk)\cr⊗ = n - {1\over n} \sum ↓{1≤q≤n} q\,\bigglp{\lfloor n/q\rfloor
+ 1\choose 2}-{\lfloor n/(q + 1)\rfloor + 1\choose 2}\biggrp\cr
⊗= n - {1\over n} \sum ↓{1≤q≤n}\!{\lfloor n/q\rfloor +
1\choose 2} = \bigglp 1 - {π↑2\over 12}\biggrp\,n + O(\log n)⊗(21)\cr}$$
$\biglp$cf.\ exercise 4.5.2--10(c)$\bigrp$. This is only
about $.1775n$, not $.25n$; so the value of $n\mod k$ tends
to be smaller than the above model predicts, and Euclid's algorithm
works faster than we might expect.
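% Both effects are visible numerically. This Python sketch (ours)
% computes the true averages T_n of (19) next to the harmonic numbers
% H_n suggested by the model:
%
%   def T(u, v):
%       steps = 0
%       while v:
%           u, v = v, u % v
%           steps += 1
%       return steps
%
%   def Tn(n):                    # eq. (19)
%       return sum(T(k, n) for k in range(n)) / n
%
%   def H(n):
%       return sum(1 / k for k in range(1, n + 1))
%
%   for n in (100, 1000, 10000):
%       print(n, Tn(n), H(n))
%   # T_n comes to about 4.6, 6.4, 8.3 (cf. the table of sample values
%   # later in this section), while H_n is 5.19, 7.49, 9.79; so the
%   # model indeed overestimates the growth rate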
%folio 448 galley 13 (C) Addison-Wesley 1978 *
\def\bslash{\char'477 } % boldface slash (vol. 2 only)
\subsectionbegin{A continuous model} The behavior
of Euclid's algorithm with $v = N$ is essentially determined
by the behavior of the regular continued fraction process when
$X = 0/N$, $1/N$, $\ldotss$, $(N - 1)/N$. Assuming that $N$ is very
large, we are led naturally to a study of regular continued
fractions when $X$ is a random real number uniformly distributed in
$[\,0, 1)$. Therefore let us define the distribution function
$$F↓n(x) =\hbox{probability that }X↓n ≤ x,\qquad\hbox{for }0
≤ x ≤ 1,\eqno (22)$$
given a uniform distribution of $X = X↓0$. By the
definition of regular continued fractions, we have $F↓0(x) =
x$, and
$$\eqalignno{F↓{n+1}(x) ⊗= \sum ↓{k≥1}\hbox{probability that }(k
≤ 1/X↓n ≤ k + x)\cr
⊗= \sum ↓{k≥1}\hbox{probability that }\biglp 1/(k + x) ≤ X↓n ≤
1/k\bigrp\cr
⊗= \sum ↓{k≥1} \biglp F↓n(1/k) - F↓n\biglp 1/(k +x)\bigrp\bigrp.⊗(23)\cr}$$
If the distributions $F↓0(x)$, $F↓1(x)$, $\ldots$ defined
by these formulas approach a limiting distribution $F↓∞(x) =
F(x)$, we will have
$$\chop to 12pt{F(x) = \sum ↓{k≥1} \biglp F(1/k) - F\biglp 1/(k + x)\bigrp\bigrp.}
\eqno (24)$$
One function that satisfies this relation
is $F(x) = \log↓b (1 + x)$, for any base $b > 1$; see exercise
19. The further condition $F(1) = 1$ implies that we should
take $b = 2$. Thus it is reasonable to make a guess that $F(x)
= \lg (1 + x)$, and that $F↓n(x)$ approaches this behavior.
We might conjecture, for example, that $F({1\over
2}) =\lg({3\over 2}) \approx 0.58496$; let us see how close
$F↓n({1\over 2})$ comes to this value for small $n$. We have
$F↓0({1\over 2})={1\over2}$, and
$$\baselineskip26pt\eqalign{F↓1(\textstyle{1\over 2}) ⊗={1\over1}-{1\over1+
\vcenter{\hbox{$\scriptstyle{1\over2}$}}}
+{1\over 2}-{1\over 2 +\vcenter{\hbox{$\scriptstyle{1\over2}$}}} +\cdots \cr
⊗= 2\,\left({1\over 2} - {1\over 3} + {1\over 4} - {1\over
5} +\cdots\right) = 2(1 - \ln 2) \approx 0.6137;\cr
F↓2(\textstyle{1\over 2}) ⊗= \sum ↓{m≥1} {2\over m}\,\left({1\over
2m + 2} - {1\over 3m + 2} + {1\over 4m + 2} - {1\over 5m + 2}
+\cdots \right)\cr
⊗= \sum ↓{m≥1} {2\over m↑2}\,\left({1\over 2} - {1\over
3} + {1\over 4} -\cdots\right)\cr
⊗\quad\null - \sum ↓{m≥1} {4\over m}\,\left({1\over 2m(2m + 2)}
- {1\over 3m(3m + 2)} +\cdots\right)\cr
⊗= {1\over 3}π↑2(1 - \ln 2) - \sum ↓{m≥1} {4S↓m\over
m↑2} ,\cr}$$
where $S↓m=1/(4m+4)-1/(9m+6)+1/(16m+8)-\cdotss$. Using the values of
$H↓x$ for fraction $x$ found in Table 3 of Appendix B, we find that
$$\textstyle S↓1={1\over12},\qquad S↓2={3\over4}-\ln 2,\qquad S↓3={19\over20}-π/
(2\sqrt3\,),$$
etc.; a numerical evaluation yields $F↓2({1\over2})\approx0.5748$. Although
$F↓1(x)=H↓x$, it is clear that $F↓n(x)$ is difficult to calculate exactly
when $n$ is large.
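% The values F_n(1/2) can also be estimated by direct simulation of the
% map (10); a rough Monte Carlo sketch in Python (our own, accurate only
% up to sampling noise):
%
%   import random
%
%   def Fn_at(n, x, trials=10**6):
%       # estimates F_n(x) = Pr(X_n <= x) for X_0 uniform in [0,1)
%       hits = 0
%       for i in range(trials):
%           X = random.random()
%           for j in range(n):
%               if X == 0: break   # rational point, probability ~ 0
%               r = 1 / X
%               X = r - int(r)     # X_{n+1} = 1/X_n - floor(1/X_n)
%           hits += X <= x
%       return hits / trials
%
%   print(Fn_at(1, 0.5))   # about 0.614 = 2(1 - ln 2)
%   print(Fn_at(2, 0.5))   # about 0.575, vs the limit lg(3/2) = 0.58496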
The distributions $F↓n(x)$ were first studied by K. F. Gauss, who thought of
the problem in 1800. His notebook for that year lists various recurrence
relations and gives a brief table of values, including the four-place value for
$F↓2({1\over2})$ that has just been mentioned. After performing these
calculations, Gauss wrote, ``{\sl Tam complicat\ae\ evadunt, ut nulla spes
superesse videatur},'' i.e., ``They come out so complicated that no hope appears
to be left.'' Twelve years later, Gauss wrote a letter to Laplace in which he
posed the problem as one he could not resolve to his satisfaction. He said,
``I found by very simple reasoning that, for $n$ infinite, $F↓n(x)=\log(1+x)/\!
\log 2$. But the efforts which I made since then in my inquiries to assign
$F↓n(x)-\log(1+x)/\!\log 2$ for very large but not infinite values of $n$ were
fruitless.''\xskip He never published his ``very simple reasoning,'' and it is
not completely clear that he had found a rigorous proof. More than 100 years
went by before a proof was finally published, by R. O. Kuz'min [{\sl Atti del
Congresso internazionale dei matematici \bf6} (Bologna, 1928), 83--89], who
showed that$$F↓n(x)=\lg(1+x)+O(e↑{-A\sqrt n\,})$$
for some positive constant $A$. The error term was improved
to $O(e↑{-An})$ by Paul L\'evy shortly afterward [{\sl
Bull.\ Soc.\ Math.\ de France \bf 57} (1929), 178--194]\footnote*{An
exposition of L\'evy's interesting proof appeared in the first edition of
this book.}; but
Gauss's problem, namely to find the asymptotic behavior of $F↓n(x)
- \lg(1 + x)$, was not really resolved until 1974, when Eduard
Wirsing published a beautiful analysis of the situation [{\sl
Acta Arithmetica \bf 24} (1974), 507--528]. We shall study the
simplest aspects of Wirsing's approach here, since his method
is an instructive use of linear operators.
If $G$ is any function of $x$ defined for $0≤ x
≤ 1$, let $SG$ be the function defined by
$$\chop to 12pt{SG(x) = \sum ↓{k≥1} \left(G\left({1\over k}\right)
- G\left({1\over k + x}\right)\right).}\eqno (25)$$
Thus, $S$ is an operator that changes one function
into another. In particular, by (23) we have $F↓{n+1}(x) = SF↓n(x)$,
hence
$$F↓n = S↑nF↓0.\eqno (26)$$
(In this discussion $F↓n$ stands for a distribution
function, {\sl not} for a Fibonacci number.) Note that $S$ is
a ``linear operator''; i.e., $S(cG) = c(SG)$ for all constants $c$,
and $S(G↓1 + G↓2) = SG↓1 + SG↓2$.
Now if $G$ has a bounded first derivative, we can
differentiate (25) term by term to show that
$$\chop to 12pt{(SG)↑\prime (x) = \sum ↓{k≥1} {1\over (k + x)↑2} G↑\prime
\left({1\over k + x}\right),}\eqno (27)$$
hence $SG$ also has a bounded first derivative.
$\biglp$Term-by-term differentiation of a convergent series is justified
when the series of derivatives is uniformly convergent; cf.\
K. Knopp, {\sl Theory and Application of Infinite Series} (Glasgow:
Blackie, 1951), $\section$47.$\bigrp$
Let $H = SG$, and let $g(x) = (1 + x)G↑\prime (x)$,
$h(x) = (1 + x)H↑\prime (x)$. It follows that
$$\eqalign{h(x)⊗ = \sum ↓{k≥1}\;{1 + x\over (k + x)↑2}\; \left(1 + {1\over
k + x}\right)↑{-1}g\left(1\over k + x\right)\cr⊗=\sum ↓{k≥1}\left({k\over k
+ 1 + x} - {k - 1\over k + x}\right)\,g\left(1\over k + x\right).\cr}$$
In other words, $h = Tg$, where $T$ is the linear
operator defined by
$$\chop to 12pt{Tg(x) = \sum ↓{k≥1} \left({k\over k + 1 + x} - {k - 1\over
k + x}\right)\,g\left(1\over k + x\right).}\eqno (28)$$
Continuing, we see that if $g$ has a bounded first
derivative, we can differentiate term by term to show that $Tg$
does also:
$$\eqalign{(Tg)↑\prime (x) ⊗ = - \sum ↓{k≥1}\biggglp \left({k\over (k + 1
+ x)↑2}-{k-1\over(k+x)↑2}\right)\,g\left(1\over k + x\right)\cr
⊗\hskip50pt\null+\left({k\over k+1+x}-{k-1\over k+x}\right){1\over(k+x)↑2}
g↑\prime\left(1\over k+x\right)\bigggrp\cr
\noalign{\vskip4pt}
⊗=-\sum↓{k≥1}\biggglp{k\over(k+1+x)↑2}\left(g\left(1\over k+x\right)-g\left(1
\over k+1+x\right)\right)\cr
⊗\hskip50pt\null+{1+x \over (k+x)↑3(k+1+x)}g↑\prime\left(1\over k+x\right)\bigggrp\,
.\cr}$$
There is consequently a third linear operator, $U$, such
that $(Tg)↑\prime = -U(g↑\prime )$, namely
$$\twoline{U\varphi (x) = \sum ↓{k≥1}\biggglp{k\over (k + 1 + x)↑2} \int
↑{1/(k+x)}↓{1/(k+1+x)} \varphi (t)\,dt}{0pt}{\null + {1 + x\over (k + x)↑3(k
+ 1 + x)} \varphi \left(1\over k + x\right)\bigggrp.\qquad(29)\hskip-10pt}$$
%folio 451 galley 14 (C) Addison-Wesley 1978 *
\def\bslash{\char'477 } % boldface slash (vol. 2 only)
What is the relevance of all this to our problem?
Well, if we set
$$\eqalignno{F↓n(x)⊗=\lg(1 + x) + R↓n\biglp\lg(1 + x)\bigrp,⊗(30)\cr
\noalign{\vskip4pt}
f↓n(x) = (1 + x)\,F↑\prime↓{\!n}(x)⊗= {1\over
\ln 2} \biglp 1 + R↑\prime↓{n}\biglp\lg(1 + x)\bigrp\bigrp,⊗(31)\cr}$$
we have
$$f↑\prime↓{\!n}(x) = R↓{n}↑{\prime\prime}\biglp\lg(1 + x)\bigrp/\biglp(\ln 2)↑2(1
+ x)\bigrp;\eqno (32)$$
the effect of the $\lg(1 + x)$ term disappears, after
these transformations. Further\-more since $F↓n = S↑nF↓0$ we have
$f↓n = T↑nf↓0$ and $f↑\prime↓{\!n} = (-1)↑nU↑nf↑\prime↓{0}$,
and $F↓n$ and $f↓n$ have bounded derivatives, by induction on
$n$. Thus (32) becomes
$$(-1)↑nR↓{n}↑{\prime\prime}\biglp\lg(1+x)\bigrp=(1+x)(\ln2)↑2U↑nf↓0↑\prime(x).
\eqno (33)$$
Now $F↓0(x) = x$, $f↓0(x) = 1 + x$, and
$f↓0↑\prime(x)$ is the constant function 1. We are going
to show that the operator $U↑n$ takes the constant function
into a function with very small values, hence $|R↓{n}↑{\prime\prime}(x)|$ must
be very small for $0 ≤ x ≤ 1$. Finally we can clinch the argument
by showing that $R↓n(x)$ itself is small: Since $R↓n(0) = R↓n(1)
= 0$, it follows from a well-known interpolation formula (cf.\
exercise 4.6.4--15 with $x↓0 = 0$, $x↓1 = x$, $x↓2 = 1$) that
$$R↓n(x) = -\,{x(1 - x)\over 2}\, R↓{n}↑{\prime\prime}\biglp\xi(x)\bigrp\eqno (34)$$
for some function $\xi (x)$, where $0 ≤ \xi (x) ≤ 1$ when
$0 ≤ x ≤ 1$.
Thus everything hinges on our being able to prove
that $U↑n$ produces small function values, where $U$ is the
linear operator defined in (29). Note that $U$ is a {\sl positive}
operator, in the sense that $U\varphi (x) ≥ 0$ for all $x$ if
$\varphi (x) ≥ 0$ for all $x$. It follows that $U$ is order-preserving:
If $\varphi ↓1(x) ≤ \varphi ↓2(x)$ for all $x$ then $U\varphi
↓1(x) ≤ U\varphi ↓2(x)$ for all $x$.
One way to exploit this property is to find a function
$\varphi$ for which we can calculate $U\varphi$ exactly and
to use constant multiples of this function to bound the ones
that we are really interested in. First let us look for a function $g$
such that $Tg$ is easy to compute. If we consider functions
defined for all $x ≥ 0$, instead of only on $[0, 1]$, it is easy
to remove the summation from (25) by observing that
$$SG(x + 1) - SG(x) = G\left(1\over 1 + x\right) - \lim↓{k→∞}
G\left(1\over k + x\right) = G\left(1\over 1 + x\right) - G(0)\eqno(35)$$
when $G$ is continuous. Since $T\biglp (1 + x)G↑\prime
\bigrp = (1 + x)(SG)↑\prime $, it follows (see exercise 20)
that
$${Tg(x)\over x + 1} - {Tg(x + 1)\over x + 2} = \left({1\over
x + 1} - {1\over x + 2}\right)g\left(1\over 1 + x\right).\eqno (36)$$
If we set $Tg(x) = 1/(x + 1)$, we find that the
corresponding value of $g(x)$ is ${1 + x} - 1/(1 + x)$. Let $\varphi
(x) = g↑\prime (x) = 1 + 1/(1 + x)↑2$, so that $U\varphi (x)
= -(Tg)↑\prime (x) = 1/(1 + x)↑2$; this is the function $\varphi$
we have been looking for.
For this choice of $\varphi$ we have $2 ≤ \varphi
(x)/U\varphi (x) = (1 + x)↑2 + 1 ≤ 5$ for $0 ≤ x ≤ 1$, hence
$$\textstyle{1\over 5}\varphi ≤ U\varphi ≤ {1\over 2}\varphi .$$
By the positivity of $U$ and $\varphi$ we can apply
$U$ to this inequality again, obtaining ${1\over 25}\varphi
≤ {1\over 5}U\varphi ≤ U↑2\varphi ≤ {1\over2}U\varphi ≤{1\over 4}\varphi
$; and after $n - 1$ applications we have
$$5↑{-n}\varphi ≤ U↑n\varphi ≤ 2↑{-n}\varphi \eqno (37)$$
for this particular $\varphi $. Let $\chi (x) =
f↑\prime↓{0}(x) = 1$ be the constant function; then for $0
≤ x ≤ 1$ we have ${5\over 4}\chi ≤ \varphi ≤ 2\chi $, hence
$$\textstyle{5\over 8}5↑{-n}\chi ≤ {1\over 2}5↑{-n}\varphi ≤ {1\over
2}U↑n\varphi ≤ U↑n\chi ≤ {4\over 5}U↑n\varphi ≤ {4\over 5}2↑{-n}\varphi
≤ {8\over 5}2↑{-n}\chi .$$
It follows by (33) that
$${\textstyle{5\over 8}}(\ln 2)↑25↑{-n} ≤ (-1)↑nR↓{n}↑{\prime\prime}(x)
≤ {\textstyle{16\over 5}}(\ln
2)↑22↑{-n},\qquad 0 ≤ x ≤ 1,$$
hence by (30) and (34) we have proved the following result:
\thbegin Theorem W. $F↓n(x) = \lg(1 + x) + O(2↑{-n})$.
{\sl In fact, $F↓n(x) - \lg(1 + x)$ lies between ${5\over 16}(-1)↑{n+1}5↑{-n}\biglp
\ln(1 + x)\bigrp\ln\biglp 2/(1 + x)\bigrp$ and\/\ ${8\over 5}(-1)↑{n+1}2↑{-n}\biglp
\ln(1 + x)\bigrp\ln\biglp 2/(1 + x)\bigrp$, for $0 ≤ x ≤ 1$}.\quad\blackslug
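% The bounds of Theorem W can be spot-checked at n = 1, where
% F_1(x) = sum over k of (1/k - 1/(k+x)) is known exactly. A Python
% sketch (our own; the tail of the series is estimated by its integral):
%
%   from math import log, log2
%
%   def F1(x, K=10**6):
%       s = sum(1/k - 1/(k + x) for k in range(1, K))
%       return s + log(1 + x/K)    # integral estimate of the tail
%
%   for x in (0.25, 0.5, 0.75):
%       err = F1(x) - log2(1 + x)
%       lo = (5/16) * (1/5) * log(1 + x) * log(2/(1 + x))
%       hi = (8/5) * (1/2) * log(1 + x) * log(2/(1 + x))
%       print(lo <= err <= hi)     # True each time (the n = 1 case)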
With a slightly different choice of $\varphi
$, we can obtain tighter bounds (see exercise 21). In fact,
Wirsing went much further in his paper, proving that
$$F↓n(x) = \lg(1 + x) + (-λ)↑n\Psi (x) + O\biglp x(1 - x)(λ - 0.031)↑n\bigrp,
\eqno(38)$$
where
$$\baselineskip15pt\eqalign{λ ⊗= 0.30366\ 30028\ 98732\ 65860\ \ldots\cr
⊗= \bslash 3, 3, 2, 2, 3, 13, 1, 174, 1, 1, 1, 2, 2,
2, 1, 1, 1, 2, 2, 1, \ldotss\bslash\cr} \eqno (39)$$
is a fundamental constant (apparently unrelated to more familiar constants), and
where $\Psi$ is an interesting function that is analytic in
the entire complex plane except for the negative real axis from
$-1$ to $-∞$. Wirsing's function satisfies $\Psi (0) = \Psi (1)
= 0$, $\Psi ↑\prime (0) < 0$, and $S\Psi = -λ\Psi $; thus by (35)
it satisfies the identity
$$\Psi (z) - \Psi (z + 1) = {1\over λ} \Psi \left(1\over 1+ z\right).\eqno (40)$$
Furthermore, Wirsing demonstrated that
$$\Psi \left(-{u\over v} + {i\over N}\right) = cλ↑{-n}\log N +
O(1)\qquad\hbox{as }N → ∞,\eqno (41)$$
where $c$ is a constant and $n = T(u, v)$ is the
number of iterations when Euclid's algorithm is applied to the
integers $u > v > 0$.
A complete solution to Gauss's problem was found a few years later by K. I. Babenko
[{\sl Doklady Akad.\ Nauk SSSR\/\ \bf238} (1978), 1021--1024],
who used powerful techniques of functional analysis to prove that
$$\chop to 9pt{F↓n(x)=\lg(1+x)+\sum↓{j≥2}λ↓j↑n\,\Psi↓j(x)}\eqno(42)$$
for all $0≤x≤1$, $n≥1$. Here $|λ↓2|>|λ↓3|≥|λ↓4|≥\cdotss$, and each $\Psi↓j(z)$
is an analytic function in the complex plane except for a cut at
$[-∞,-1]$. The function $\Psi↓2$ is Wirsing's $\Psi$, and $λ↓2=-λ$, while
$λ↓3=0.10088$. Babenko also
established further properties of the $λ↓j$,
proving in particular that the sum for $j≥k$ in (42) is bounded by
$(π↑2/6)|λ↓k|↑{n-1}\min(x,1-x)$.
%folio 460 galley 15 (C) Addison-Wesley 1978 *
\def\bslash{\char'477 } % boldface slash (vol. 2 only)
\subsectionbegin{From continuous to discrete} We
have now derived results about the prob\-ability distributions
for continued fractions when $X$ is a real number uniformly
distributed in the interval $[\,0, 1)$. But a real number
is rational with probability zero (almost all numbers are irrational),
so these results do not apply directly to Euclid's algorithm. Before
we can apply Theorem W to our problem, some technicalities must
be overcome. Consider the following observation based on elementary
measure theory:
\thbegin Lemma M. {\sl Let $I↓1$, $I↓2$, $\ldotss$, $J↓1$, $J↓2$,
$\ldots$ be pairwise disjoint intervals contained in the interval
$[\,0, 1)$, and let $$\Iscr = \union↓{k≥1} I↓k,\qquad \Jscr = \union↓{k≥1}
J↓k,\qquad\Kscr = [0, 1] \rslash (\Iscr ∪ \Jscr).$$
Assume that $\Kscr$ has measure zero.
Let $P↓n$ be the set $\{0/n, 1/n, \ldotss , (n - 1)/n\}$. Then}
$$\lim↓{n→∞} {\| \Iscr ∩ P↓n\| \over n} = \mu (\Iscr).\eqno (43)$$
Here $\mu (\Iscr)$ is the Lebesgue measure of $\Iscr$,
namely, $\sum ↓{k≥1}\hbox{length}(I↓k)$; and ${\|\Iscr ∩ P↓n\|}$
denotes the number of elements in the set $\Iscr ∩ P↓n$.
\proofbegin Let $\Iscr↓N = \union↓{1≤k≤N\lower3pt\null}
I↓k$ and $\Jscr↓N = \union↓{1≤k≤N} J↓k$. Given $ε > 0$, find $N$
large enough so that $\mu (\Iscr↓N) + \mu (\Jscr↓N) ≥ 1 - ε$, and
let
$$\Kscr↓N = \Kscr \;∪ \union↓{k>N} I↓k\;\; ∪ \union↓{k>N} J↓k.$$
If $I$ is an interval, having any of the forms
$(a, b)$ or $[a, b)$ or $(a, b]$ or $[a, b]$, it is clear that
$\mu (I) = b - a$ and
$$n\mu (I) - 1 ≤ \| I ∩ P↓n\| ≤ n\mu (I) + 1.$$
Now let $r↓n = \| \Iscr↓N ∩ P↓n\|$,
$s↓n = \| \Jscr↓N ∩ P↓n\|$, $t↓n = \| \Kscr↓N
∩ P↓n\|$; we have
$$\vbox{\halign{$\hfill#$⊗$\null#\hfill$⊗$\null#\hfill$\cr
⊗\hfill\hskip-100pt r↓n + s↓n + t↓n = n;\hskip-100pt\cr
\noalign{\vskip 4pt}
n\mu (\Iscr↓N) - N⊗≤ r↓n⊗≤ n\mu (\Iscr↓N) + N;\cr
\noalign{\vskip 2pt}
n\mu (\Jscr↓N) - N⊗≤ s↓n⊗≤ n\mu (\Jscr↓N) + N.\cr}}$$
Hence
$$\twoline{\mu(\Iscr) - {N\over n} - ε ≤ \mu (\Iscr↓N) - {N\over n} ≤ {r↓n\over
n} ≤ {r↓n + t↓n\over n}}{3pt}{ =1- {s↓n\over n} ≤ 1 - \mu (\Jscr↓N) + {N\over
n} ≤ \mu (\Iscr) + {N\over n} + ε.}$$
This holds for all $n$ and for all $ε$; hence
$\lim↓{n→∞} r↓n/n = \mu (\Iscr)$.\quad\blackslug
\yyskip Exercise 25 shows that Lemma M is not trivial,
in the sense that some rather restrictive hypotheses are needed
to prove (43).
\subsectionbegin{Distribution of partial quotients}
Now we can put Theorem W and Lemma M together to derive some
solid facts about Euclid's algorithm.
\thbegin Theorem E. {\sl Let $n$ and $k$ be positive
integers, and let $p↓k(a, n)$ be the probability that the $(k +
1)$st quotient $A↓{k+1}$ in Euclid's algorithm is equal to $a$, when
$v = n$ and $u$ is chosen at random. Then
$$\lim↓{n→∞} p↓k(a, n) = F↓k\left(1\over a\right) - F↓k\left(1\over
a + 1\right),$$
where $F↓k(x)$ is the distribution function $(22)$}.
\proofbegin The set $\Iscr$ of all $X$ in $[\,0,
1)$ for which $A↓{k+1} = a$ is a union of disjoint intervals,
and so is the set $\Jscr$ of all $X$ for which $A↓{k+1} ≠ a$.
Lemma M therefore applies, with $\Kscr$ the set of all $X$ for
which $A↓{k+1}$ is undefined. Furthermore, $F↓k(1/a) - F↓k\biglp1/(a
+ 1)\bigrp$ is the probability that $1/(a + 1)
< X↓k ≤ 1/a$, which is $\mu (\Iscr)$, the probability that $A↓{k+1}
= a$.\quad\blackslug
\yyskip As a consequence of Theorems E and W\null, we
can say that a quotient equal to $a$ occurs with the approximate
probability
$$\lg(1 + 1/a) - \lg\biglp 1 + 1/(a + 1)\bigrp
=\lg\biglp (a + 1)↑2/\biglp (a + 1)↑2 - 1\bigrp\bigrp .$$
Thus
$$\baselineskip14pt\vbox{\halign{a quotient of #⊗ occurs about $\lg({#})$\hfill
⊗$\null=\hfill#$ percent of the time⊗#\hfill\cr
1⊗4\over 3⊗41.504⊗;\cr
2⊗9\over 8⊗16.992⊗;\cr
3⊗16\over 15⊗9.311⊗;\cr
4⊗25\over 24⊗5.890⊗.\cr}}$$
Actually, if Euclid's algorithm produces the quotients
$A↓1$, $A↓2$, $\ldotss$, $A↓t$, the nature of the proofs above will
guarantee this behavior only for $A↓k$ when $k$ is comparatively
small with respect to $t$; the values $A↓{t-1}$, $A↓{t-2}$, $\ldots$
are not covered by this proof. But we can in fact show that
the distribution of the last quotients $A↓{t-1}$, $A↓{t-2}$, $\ldots$
is essentially the same as that of the first.
%folio 461 galley 16 (C) Addison-Wesley 1978 *
\def\bslash{\char'477 } % boldface slash (vol. 2 only)
For example, consider the regular continued fraction
expansions for the set of all proper fractions whose denominator is 29:
$$\baselineskip15pt
\vbox{\halign to size{$\hfill{#\over29}$⊗$\null=\bslash\,#\hskip .5pt
\bslash$\hfill\tabskip0pt plus100pt
⊗$\hfill{#\over29}$\tabskip0pt⊗$\null=\bslash\,#\hskip .5pt\bslash$\hfill
\tabskip0pt plus 100pt
⊗$\hfill{#\over29}$\tabskip0pt⊗$\null=\bslash\,#\hskip .5pt\bslash$\hfill
\tabskip0pt plus 100pt
⊗$\hfill{#\over29}$\tabskip0pt⊗$\null=\bslash\,#\hskip .5pt\bslash$\hfill\cr
1⊗29⊗8⊗3,1,1,1,2⊗15⊗1,1,14⊗22⊗1,3,7\hskip-1pt\cr
2⊗14,2⊗9⊗3,4,2⊗16⊗1,1,4,3⊗23⊗1,3,1,5\cr
3⊗9,1,2⊗10⊗2,1,9⊗17⊗1,1,2,2,2⊗24⊗1,4,1,4\cr
4⊗7,4⊗11⊗2,1,1,1,3⊗18⊗1,1,1,1,1,3⊗25⊗1,6,4\cr
5⊗5,1,4⊗12⊗2,2,2,2⊗19⊗1,1,1,9⊗26⊗1,8,1,2\cr
6⊗4,1,5⊗13⊗2,4,3⊗20⊗1,2,4,2⊗27⊗1,13,2\cr
7⊗4,7\hskip-1pt⊗14⊗2,14⊗21⊗1,2,1,1,1,2⊗28⊗1,28\cr}}$$
Several things can be observed in this table.
\yskip\textindent{a)}As mentioned earlier, the last quotient is always
2 or more. Furthermore, we have the obvious identity
$$\bslash x↓1, \ldotss , x↓{n-1}, x↓n + 1\bslash = \bslash
x↓1, \ldotss , x↓{n-1}, x↓n, 1\bslash, \eqno (44)$$
and this shows how continued fractions whose last partial quotient
is unity are related to regular continued fractions.
\textindent{b)}The values in the right-hand columns have a
simple relationship to the values in the left-hand columns;
can the reader see the correspondence before reading any further?
The relevant identity is
$$1 - \bslash x↓1, x↓2, \ldotss , x↓n\bslash = \bslash 1,
x↓1 - 1, x↓2, \ldotss , x↓n\bslash ;\eqno (45)$$
see exercise 9.
\textindent{c)}There is symmetry between left and right in
the first two columns: If $\bslash A↓1, A↓2, \ldotss , A↓t\bslash$
occurs, so does $\bslash A↓t, \ldotss , A↓2, A↓1\bslash $.
This will always be the case (see exercise 26).
\textindent{d)}If we examine all of the quotients in the table,
we find that there are 96 in all, of which ${39\over 96}$ =
40.6 percent are equal to 1, ${21\over 96}$ = 21.9 percent are
equal to 2, ${8\over 96}$ = 8.3 percent are equal to 3; this
agrees reasonably well with the probabilities listed above.
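The table and the tally in observation (d) can both be reproduced by a few
lines of Python (a sketch of the regular continued fraction process; the
names are ours):

    from collections import Counter

    def regular_cf(num, den):
        # Partial quotients A1, A2, ..., At of num/den, 0 < num < den,
        # by the regular continued fraction (Euclidean) process.
        quotients = []
        while num:
            quotients.append(den // num)
            num, den = den % num, num
        return quotients

    tally = Counter()
    for m in range(1, 29):
        tally.update(regular_cf(m, 29))   # e.g. regular_cf(8, 29) = [3,1,1,1,2]
    total = sum(tally.values())           # 96 partial quotients in all
    for a in (1, 2, 3):
        print(a, tally[a], 100 * tally[a] / total)   # 39, 21, and 8 occurrences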
\subsectionbegin{The number of division steps} Let
us now return to our original problem and investigate $T↓n$,
the average number of division steps when $v = n$.\xskip $\biglp$See
Eq.\ (19).$\bigrp$\xskip Here are some sample values of
$T↓n$:
$$\eqalign{\rpile{n =\quad\cr T↓n=\quad\cr}⊗\cpile{95\cr 5.0\cr}\quad
\cpile{96\cr 4.4\cr}\quad\cpile{97\cr 5.3\cr}\quad\cpile{98\cr 4.8\cr}\quad
\cpile{99\cr 4.7\cr}\quad\cpile{100\cr 4.6\cr}\quad\cpile{101\cr 5.3\cr}\quad
\cpile{102\cr 4.6\cr}\quad\cpile{103\cr 5.3\cr}\quad\cpile{104\cr 4.7\cr}\quad
\cpile{105\cr 4.6\cr}\cr\noalign{\vskip7pt}
\rpile{n=\quad\cr T↓n=\quad\cr}⊗\cpile{996\cr 6.5\cr}\quad\cpile{997\cr 7.3\cr}\quad
\cpile{998\cr 7.0\cr}\quad\cpile{999\cr 6.8\cr}\quad\cpile{1000\cr 6.4\cr}\quad
\cpile{1001\cr 6.7\cr}\quad\cpile{\ldots\cr\ldots\cr}\quad\cpile{9999\cr 8.6\cr}
\quad\cpile{10000\cr 8.3\cr}\quad\cpile{10001\cr 9.1\cr}\cr\noalign{\vskip7pt}
\rpile{n=\quad\cr T↓n=\quad\cr}⊗\cpile{49999\cr 10.6\cr}\quad
\cpile{50000\cr 9.7\cr}\quad\cpile{50001\cr 10.0\cr}\quad\cpile{\ldots\cr\ldots\cr}
\quad\cpile{99999\cr 10.7\cr}\quad\cpile{100000\cr 10.3\cr}\quad\cpile{100001\cr
11.0\cr}\cr}$$
Note the somewhat erratic behavior; $T↓n$ tends to be
higher than its neighbors when $n$ is prime, and it is correspondingly
lower when $n$ has many divisors.\xskip (In this list, 97, 101, 103,
997, and 49999 are primes; $10001 = 73 \cdot 137$, $50001 = 3 \cdot
7 \cdot 2381$, $99999 = 3 \cdot 3 \cdot 41 \cdot 271$, and $100001 =
11 \cdot 9091$.)\xskip It is not difficult to understand why this happens:
if $\gcd(u, v) = d$, Euclid's algorithm applied to $u$ and $v$
behaves essentially the same as if it were applied to $u/d$
and $v/d$. Therefore, when $v = n$ has several divisors, there
are many choices of $u$ for which $n$ behaves as if it were
smaller.
Accordingly let us consider {\sl another} quantity,
$\tau ↓n$, which is the average number of division steps when
$v = n$ and when $u$ is {\sl relatively prime} to $n$. Thus
$$\tau ↓n = {1\over \varphi (n)} \sum ↓{\scriptstyle0≤m<n\atop
\scriptstyle\gcd(m,n)=1}
T(m, n).\eqno (46)$$
It follows that
$$T↓n = {1\over n} \sum ↓{d\rslash n} \varphi (d)\tau ↓d.\eqno
(47)$$
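For modest $n$, all of these quantities yield to brute force. The following
Python sketch (with names of our own devising) computes $T(m,n)$, $T↓n$, and
$\tau↓n$, and confirms identity (47) for $n=100$:

    from math import gcd

    def T(m, n):
        # Number of division steps when Euclid's algorithm is applied to (m, n).
        steps = 0
        while n:
            m, n = n, m % n
            steps += 1
        return steps

    def T_avg(n):
        # T_n of Eq. (19): the average of T(m, n) over 0 <= m < n.
        return sum(T(m, n) for m in range(n)) / n

    def tau(n):
        # tau_n of Eq. (46): m restricted to be relatively prime to n.
        coprime = [m for m in range(n) if gcd(m, n) == 1]
        return sum(T(m, n) for m in coprime) / len(coprime)

    def phi(n):
        return sum(1 for m in range(n) if gcd(m, n) == 1)

    n = 100
    lhs = T_avg(n)
    rhs = sum(phi(d) * tau(d) for d in range(1, n + 1) if n % d == 0) / n
    print(lhs, rhs)    # both are about 4.56, illustrating Eq. (47)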
Here is a table of $\tau ↓n$ for the same values
of $n$ considered above:
$$\eqalign{\rpile{n =\quad\cr \tau↓n=\quad\cr}⊗\cpile{95\cr 5.4\cr}\quad
\cpile{96\cr 5.3\cr}\quad\cpile{97\cr 5.3\cr}\quad\cpile{98\cr 5.6\cr}\quad
\cpile{99\cr 5.2\cr}\quad\cpile{100\cr 5.2\cr}\quad\cpile{101\cr 5.4\cr}\quad
\cpile{102\cr 5.3\cr}\quad\cpile{103\cr 5.4\cr}\quad\cpile{104\cr 5.3\cr}\quad
\cpile{105\cr 5.6\cr}\cr\noalign{\vskip7pt}
\rpile{n=\quad\cr \tau↓n=\quad\cr}⊗\cpile{996\cr 7.2\cr}\quad\cpile{997\cr 7.3\cr}\quad
\cpile{998\cr 7.3\cr}\quad\cpile{999\cr 7.3\cr}\quad\cpile{1000\cr 7.3\cr}\quad
\cpile{1001\cr 7.4\cr}\quad\cpile{\ldots\cr\ldots\cr}\quad\cpile{9999\cr 9.21\cr}
\quad\cpile{10000\cr 9.21\cr}\quad\cpile{10001\cr 9.22\cr}\cr\noalign{\vskip7pt}
\rpile{n=\quad\cr \tau↓n=\quad\cr}⊗\cpile{49999\cr 10.58\cr}\quad
\cpile{50000\cr 10.57\cr}\quad\cpile{50001\cr 10.59\cr}\quad
\cpile{\ldots\cr\ldots\cr}
\quad\cpile{99999\cr 11.170\cr}\quad\cpile{100000\cr 11.172\cr}\quad\cpile{100001\cr
11.172\cr}\cr}$$
Clearly $\tau ↓n$ is much better behaved than $T↓n$,
and it should be more susceptible to analysis. Inspection of
a table of $\tau ↓n$ for small $n$ reveals some curious anomalies;
for example, $\tau ↓{50} = \tau ↓{100}$ and $\tau ↓{60} = \tau
↓{120}$. But as $n$ grows, the values of $\tau ↓n$ behave quite
regularly indeed, as the above table indicates, and they show
no significant relation to the factorization properties of $n$.
If the reader will plot the values of $\tau ↓n$ versus $\ln n$
on graph paper, for the values of $\tau ↓n$ given above, he
will see that the values lie very nearly on a straight line,
and that the formula
$$\tau ↓n \approx 0.843\ln n + 1.47\eqno (48)$$
is a very good approximation.
We can account for this behavior if we study the regular continued fraction
process a little further. Note that in Euclid's algorithm
as expressed in (15) we have
$${V↓0\over U↓0}\,{V↓1\over U↓1} \ldotsm {V↓{t-1}\over U↓{t-1}}
= {V↓{t-1}\over U↓0} ,$$
since $U↓{k+1} = V↓k$; therefore, if $U = U↓0$
and $V = V↓0$ are relatively prime, and if there are $t$ division
steps, we have
$$X↓0X↓1 \ldotsm X↓{t-1} = 1/U.$$
Setting $U = N$ and $V = m < N$, we find that
$$\ln X↓0 + \ln X↓1 +\cdots + \ln X↓{t-1} = -\ln N.\eqno (49)$$
We know the approximate distribution of $X↓0$, $X↓1$,
$X↓2$, $\ldotss $, so we can use this equation to estimate
$$t = T(N, m) = T(m, N) - 1.$$
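The telescoping is easy to watch in action; here is a tiny Python
illustration (assuming, as above, that $U$ and $V$ are relatively prime):

    from fractions import Fraction

    def ratio_product(U, V):
        # Product of the X_k = V_k/U_k of (15), where U_{k+1} = V_k and
        # V_{k+1} = U_k mod V_k; the product telescopes to V_{t-1}/U_0.
        prod = Fraction(1)
        while V:
            prod *= Fraction(V, U)
            U, V = V, U % V
        return prod

    print(ratio_product(29, 8))    # 1/29; in general X_0 X_1 ... X_{t-1} = 1/U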
%folio 465 galley 1 Bad spots. (C) Addison-Wesley 1978 *
\def\bslash{\char'477 } % boldface slash (vol. 2 only)
Returning to the formulas preceding Theorem W\null, we
find that the average value of $\ln X↓n$, when $X↓0$ is a real
number uniformly distributed in $[\,0, 1)$, is
$$\int ↑{1}↓{0} \ln x\,F↑\prime↓{\!n}(x)\,dx =
\int ↑{1}↓{0} \ln x\,f↓n(x)\,dx/(1 + x),\eqno (50)$$
where $f↓n(x)$ is defined in (31). Now
$$f↓n(x) = {1\over \ln 2} + O(2↑{-n}),\eqno (51)$$
using the facts we have derived earlier (see exercise 23);
hence the average value of $\ln X↓n$ is very well approximated by
$$\baselineskip27pt\eqalign{{1\over \ln 2} \int ↑{1}↓{0} {\ln x\over 1 + x}\,dx
⊗= - {1\over \ln 2} \int ↑{∞}↓{0}\hskip-2pt {ue↑{-u}\over 1 + e↑{-u}}\,d↓{\null}u\cr
⊗= - {1\over \ln 2} \sum ↓{k≥1}\;(-1)↑{k+1} \int ↑{∞}↓{0}ue↑{-ku}\,d↓{\null}u\cr
⊗= - {1\over\ln 2} \left(1 - {1\over 4} + {1\over 9} - {1\over16}
+ {1\over 25} - \cdotss\right)\cr
⊗= - {1\over \ln 2}\left(1 + {1\over 4} + {1\over 9} + \cdots
- 2\left({1\over 4} + {1\over 16} + {1\over 36} + \cdotss\right)\right)\cr
⊗= - {1\over 2\ln 2} \left(1 + {1\over 4} + {1\over 9} +
\cdotss\right)\cr
⊗= -π↑2/(12 \ln 2).\cr}$$
By (49) we therefore expect to have the approximate formula
$$-tπ↑2/(12 \ln 2) \approx -\ln N;$$
that is, $t$ should be approximately equal
to $\biglp(12 \ln 2)/π↑2\bigrp \ln N$. This constant $(12 \ln 2)/π↑2 = 0.842765913
\,\ldots$ agrees perfectly with the empirical formula (48) obtained
earlier, so we have good reason to believe that the formula
$$\tau ↓n \approx {12 \ln 2\over π↑2}\,\ln n + 1.47\eqno (52)$$
indicates the true asymptotic behavior of $\tau↓n$ as $n → ∞$.
If we assume that (52) is valid, we obtain
the formula
$$T↓n\approx{12\ln2\overπ↑2}\bigglp\ln n-\sum↓{d\rslash n}\Lambdait(d)/d\biggrp
+1.47,\eqno(53)$$
where $\Lambdait(d)$ is {\sl von Mangoldt's function} defined
by the rules
$$\baselineskip 15pt
\Lambdait(n)=\left\{\vcenter{\halign{$#,\hfill\qquad$⊗#\hfill\cr
\ln p⊗if $n=p↑r$ for $p$ prime and $r≥1$;\cr
0⊗otherwise.\cr}}\right.\eqno(54)$$
For example,
$$\baselineskip15pt\eqalign{T↓{100}⊗\approx{12\ln2\overπ↑2}\left(
\ln 100-{\ln2\over2}-{\ln2\over4}-{\ln5\over5}-{\ln5\over25}\right)+1.47\cr
\noalign{\vskip 3pt}
⊗\approx(0.843)(4.605-0.347-0.173-0.322-0.064)+1.47\cr
⊗\approx 4.59;\cr}$$
the exact value of $T↓{100}$ is 4.56.
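Approximation (53) is equally easy to evaluate by machine, as in this Python
sketch (function names ours):

    from math import log, pi

    def mangoldt(n):
        # von Mangoldt's function, Eq. (54): ln p if n = p^r, else 0.
        if n < 2:
            return 0.0
        p = 2
        while n % p:
            p += 1                # p is now the smallest prime factor of n
        while n % p == 0:
            n //= p
        return log(p) if n == 1 else 0.0

    def T_approx(n):
        # Eq. (53), with the empirical constant 1.47 of Eq. (48).
        s = sum(mangoldt(d) / d for d in range(1, n + 1) if n % d == 0)
        return (12 * log(2) / pi ** 2) * (log(n) - s) + 1.47

    print(T_approx(100))          # about 4.59, as computed above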
We can also estimate the average number of division steps when $u$ and $v$ are
both uniformly distributed between 1 and $N$, by calculating
$${1\over N}\sum↓{1≤n≤N}T↓n.\eqno(55)$$
Assuming formula (53), this sum has the form
$${12\ln2\overπ↑2}\ln N + O(1),\eqno(56)$$
(see exercise 29), and empirical calculations with the same numbers used to
derive Eq.\ 4.5.2--45 show good agreement with the formula
$${12\ln2\overπ↑2}\ln N+0.06.\eqno(57)$$
Of course we have not yet {\sl proved} anything about $T↓n$ and $\tau↓n$ in
general; so far we have only been considering plausible reasons why the
above formulas ought to hold. Fortunately it is now possible to supply
rigorous proofs, based on a careful analysis by several mathematicians.
The leading coefficient $(12\ln2)/π↑2$ in the above formulas was established first,
in independent studies by John D. Dixon and Hans A. Heilbronn.\xskip Dixon
[{\sl J.\penalty1000\ Number
Theory} {\bf 2} (1970), 414--422] developed the theory of the
$F↓n(x)$ distributions to show that individual partial quotients
are essentially independent of each other in an appropriate
sense, and proved that for all positive $ε$ we have $\left|T(m,
n) - \biglp(12\ln 2)/π↑2\bigrp\ln n\right| < (\ln n)↑{(1/2)+ε}$ except
for $\exp\biglp-c(ε)(\log N)↑{ε/2}\bigrp N↑2$ values of $m$ and $n$
in the range $1 ≤ m < n ≤ N$, where $c(ε) > 0$. Heilbronn's
approach was completely different, working entirely with integers
instead of continuous variables. His idea, which is presented
in slightly modified form in exercises 33 and 34, is based on
the fact that $\tau ↓n$ can be related to the number of ways
to represent $n$ in a certain manner. Furthermore, his paper
[{\sl Number Theory and Analysis}, ed.\ by Paul Tur\'an
(New York: Plenum, 1969), 87--96] shows that the distribution
of individual partial quotients 1, 2, $\ldots$ that we have discussed
above actually applies to the entire collection of partial quotients
belonging to the fractions having a given denominator; this
is a sharper form of Theorem E\null. A\penalty1000\ still
sharper result was obtained
several years later by J. W. Porter [{\sl Mathematika} {\bf 22}
(1975), 20--28], who established that $$\tau ↓n = {12\ln2\overπ↑2}
\ln n+C+O(n↑{-1/6+ε}),\eqno(58)$$
where $C = 1.4670780794\,\ldots$ is the constant $$\textstyle
\biglp(6\ln2)/π↑2\bigrp\biglp
3\ln2+4\gamma-24π↑{-2}\zeta↑\prime(2)-2\bigrp-{1\over2};$$ see D. E. Knuth,
{\sl Computers and Math.\ with Applic.\ \bf2} (1976), 137--139.
Thus the conjecture (48) is fully proved.
The average running time for Euclid's
algorithm on multiprecision integers, using classical algorithms
for arithmetic, was shown to be of order $$\biglp1 + \log\biglp\max(u,
v)/\!\gcd(u, v)\bigrp\bigrp\log\min(u, v)$$ by G. E. Collins, in {\sl
SIAM J. Computing} {\bf 3} (1974), 1--10.
\subsectionbegin{Summary} We have found that the worst
case of Euclid's algorithm occurs when its inputs $u$ and
$v$ are consecutive Fibonacci numbers (Theorem F\null); the number
of division steps when $v = n$ will never
exceed $\lceil 4.8\log↓{10} n - 0.32\rceil $. We have determined
the frequency of the values of various partial quotients, showing,
for example, that the division step finds $\lfloor u/v\rfloor
= 1$ about 41 percent of the time (Theorem E\null). And, finally,
the theorems of Heilbronn and Dixon prove that the average number
$T↓n$ of division steps when $v = n$ is approximately
$$\biglp(12 \ln2)/π↑2\bigrp\ln n\approx 1.9405\log↓{10}n.$$ Empirical
calculations show that $T↓n$ is given very accurately by this formula,
minus a correction term based on the divisors of $n$ as shown in Eq.\ (53).
%folio 468 galley 2 Bad spots. (C) Addison-Wesley 1978 *
\def\bslash{\char'477 } % boldface slash (vol. 2 only)
\def\bigbslash{\hbox{\:u\char'77}}
\exbegin{EXERCISES}
\trexno 1. [20] Since the quotient
$\lfloor u/v\rfloor$ is equal to unity over 40 percent of the
time in Algorithm 4.5.2A\null,
it may be advantageous on some computers to make a test
for this case and to avoid the division when the quotient is
unity. Is the following \MIX\ program for Euclid's algorithm more
efficient than Program 4.5.2A?
$$\vbox{\mixthree{\!
⊗LDX⊗U⊗$\rX ← u$.\cr
⊗JMP⊗2F\cr
1H⊗STX⊗V⊗$v ← \rX$.\cr
⊗SUB⊗V⊗$\rA ← u - v$.\cr
⊗CMPA⊗V\cr
⊗SRAX⊗5⊗$\rAX ← \rA$.\cr
⊗JL⊗2F⊗Is $u - v < v$?\cr
⊗DIV⊗V⊗$\rX ← \rAX\mod v$.\cr
2H⊗LDA⊗V⊗$\rA ← v$.\cr
⊗JXNZ⊗1B⊗Done if $\rX = 0$.\quad\blackslug\cr}}$$
\exno 2. [M21] Evaluate the matrix product
$$\left({x↓1\atop 1}\quad {1\atop 0}\right)\left({x↓2\atop 1}\quad
{1\atop 0}\right) \ldotsm \left({x↓n\atop 1}\quad {1\atop 0}\right).$$
\exno 3. [M21] What is the value of
$$\def\\{\hbox to 0pt{\hskip 0pt minus 100pt$-$}}
\det\left(\vcenter{\halign{\quad$\ctr{#}$⊗\qquad$\ctr{#}$⊗\qquad$\ctr{#}$⊗\qquad
$\ctr{#}$⊗\qquad$\ctr{#}$\quad\cr
x↓1⊗1⊗0⊗\ldots⊗0\cr
\\1⊗x↓2⊗1⊗⊗0\cr
0⊗\\1⊗x↓3⊗1⊗\chop to 0pt{\lower 6pt\hbox{$\vdots$}}\cr
⊗⊗\\1\cr\noalign{\vskip-6pt}
\vdots⊗⊗⊗\raise 6pt\hbox{.}\≥\raise 3pt\hbox{.}\≥.⊗1\cr
0⊗0⊗\ldots⊗\\1⊗x↓n\cr}}\right)\,?$$
\exno 4. [M20] Prove Eq.\ (8).
\exno 5. [HM25] Let $x↓1$, $x↓2$, $\ldots$
be a sequence of real numbers that are each greater than some
positive real number $ε$. Prove that the infinite continued
fraction $\bslash x↓1, x↓2,\ldotss\bslash = \lim↓{n→∞}\bslash
x↓1,\ldotss,x↓n\bslash$ exists. Show also that $\bslash x↓1,
x↓2,\ldotss\bslash$ need not exist if we assume only that $x↓j
> 0$ for all $j$.
\exno 6. [M23] Prove that the regular continued fraction expansion
of a number is {\sl unique} in the following sense:
If $B↓1$, $B↓2$, $\ldots$ are positive integers, then the infinite continued
fraction $\bslash B↓1,B↓2,\ldotss\bslash$ is an irrational
number $X$ between 0 and 1 whose regular continued fraction
has $A↓n = B↓n$ for all $n ≥ 1$; and if $B↓1$, $\ldotss$, $B↓m$
are positive integers with $B↓m > 1$, then the regular continued
fraction for $X=\bslash B↓1,\ldotss,B↓m\bslash$ has $A↓n=B↓n$ for
$1≤n≤m$.
\exno 7. [M26] Find all permutations $p(1)p(2)\ldotsm p(n)$
of the integers $\{1, 2,\ldotss,n\}$ such that $Q↓n(x↓1,x↓2,\ldotss,x↓n)=
Q↓n(x↓{p(1)},x↓{p(2)},\ldotss,x↓{p(n)})$
holds for all $x↓1$, $x↓2$, $\ldotss$, $x↓n$.
\exno 8. [M20] Show that $-1/X↓n=\bslash A↓n,\ldotss,A↓1,-X\bslash$, whenever
$X↓n$ is defined, in the regular continued fraction process.
\exno 9. [M21] Show that continued fractions satisfy the following identities:
\yskip\textindent{a)}$\bigbslash x↓1,\ldotss,x↓n\bigbslash=\bigbslash x↓1,\ldotss,
x↓k+\bslash x↓{k+1},\ldotss,x↓n\bslash\bigbslash$,\qquad $1≤k≤n$;
\penalty200\vskip2pt\textindent{b)}$\bslash 0,x↓1,x↓2,\ldotss,x↓n\bslash=x↓1
+\bslash x↓2,\ldotss,x↓n\bslash$,\qquad$n≥1$;
\penalty200\vskip2pt\textindent{c)}$\bslash x↓1,\ldotss,x↓{k-1},x↓k,0,x↓{k+1},
x↓{k+2},\ldotss,x↓n\bslash$
\penalty1000\rjustline{$=\bslash x↓1,\ldotss,x↓{k-1},x↓k+x↓{k+1},x↓{k+2},\ldotss,x↓n
\bslash$,\qquad$1≤k<n$;}
\penalty200\vskip2pt\textindent{d)}$1-\bslash x↓1,x↓2,\ldotss,x↓n\bslash=\bslash 1,
x↓1-1,x↓2,\ldotss,x↓n\bslash$,\qquad$n≥1$.
\exno 10. [M28] By the result of exercise 6, every irrational real number $X$ has
a unique regular continued-fraction representation of the form
$$X=A↓0+\bslash A↓1, A↓2, A↓3, \ldotss\bslash,$$
where $A↓0$ is an integer and $A↓1$, $A↓2$, $A↓3$, $\ldots$
are positive integers. Show that if $X$ has this representation
then the regular continued fraction for $1/X$ is
$$1/X = B↓0 + \bslash B↓1, \ldotss, B↓m, A↓5, A↓6,\ldotss\bslash$$
for suitable integers $B↓0$, $B↓1$, $\ldotss$, $B↓m$.\xskip
(The case $A↓0 < 0$ is, of course, the most interesting.)\xskip Explain
how to determine the $B$'s in terms of $A↓0$, $A↓1$, $A↓2$, $A↓3$,
and $A↓4$.
\exno 11. [M30] (J. Lagrange.)\xskip Let $X = A↓0 + \bslash A↓1,
A↓2,\ldotss\bslash$, $Y = B↓0 + \bslash B↓1, B↓2, \ldotss\bslash$
be the regular continued-fraction representations of two real
numbers $X$ and $Y$, in the sense of exercise 10. Show that
these representations ``eventually agree,'' in the sense that
$A↓{m+k} = B↓{n+k}$ for some $m$ and $n$ and for all $k ≥ 0$,
if and only if $X = (qY + r)/(sY + t)$ for some integers $q$,
$r$, $s$, $t$ with $|qt - rs| = 1$.\xskip (This theorem is the analog,
for continued-fraction representations, of the simple result
that the representations of $X$ and $Y$ in the decimal system
eventually agree if and only if $X = (10↑qY + r)/10↑s$ for some
integers $q$, $r$, and $s$.)
\trexno 12. [M30] A {\sl quadratic
irrationality} is a number of the form $(\sqrt{D} - U)/V$, where
$D$, $U$, and $V$ are integers, $D > 0$, $V ≠ 0$, and $D$ is not
a perfect square. We may assume without loss of generality that
$V$ is a divisor of $D - U↑2$, for otherwise the number may
be rewritten as $(\sqrt{DV↑2} - U|V|)/V|V|$.
\yskip\hang\textindent{a)}Prove that the regular continued fraction expansion
(in the sense of exercise 10) of a quadratic irrationality $X
= (\sqrt{D} - U)/V$ is obtained by the following formulas:
$$\baselineskip 14pt
\cpile{V↓0 = V,\qquad A↓0 = \lfloor X\rfloor ,\qquad U↓0 = U + A↓0V;\cr
V↓{n+1} = (D - U↓{\!n}↑2)/V↓n,\qquad A↓{n+1} = \lfloor
(\sqrt{D} + U↓n)/V↓{n+1}\rfloor,\cr
U↓{n+1}=A↓{n+1}V↓{n+1} - U↓n.\cr}$$
[{\sl Note:} An algorithm based on this process
has many applications to the solution of quadratic equations
in integers; see, for example, H. Davenport, {\sl The Higher
Arithmetic} (London: Hutchinson, 1952); W. J. LeVeque, {\sl
Topics in Number Theory} (Reading, Mass.: Addison-Wesley, 1956);
and see also Section 4.5.4. By exercise \hbox{1.2.4--35}, $A↓{n+1} = \lfloor
(\lfloor \sqrt{D}\rfloor + U↓n)/V↓{n+1}\rfloor$ when $V↓{n+1}
> 0$, and $A↓{n+1} = \lfloor (\lfloor \sqrt{D}\rfloor + 1 +
U↓n)/V↓{n+1}\rfloor$ when $V↓{n+1} < 0$; hence such an algorithm
need only work with the integer $\lfloor \sqrt{D}\rfloor$.]
\hang\textindent{b)}Prove that $0 < U↓n < \sqrt D$, $0 < V↓n < 2\sqrt D$,
for all $n > N$, where $N$ is some integer depending on $X$;
hence the regular continued-fraction representation of every
quadratic irrationality is eventually periodic.\xskip [{\sl Hint:}
Show that $(\sqrt{D} - U)/V = A↓0 + \bslash A↓1, \ldotss
, A↓n, -V↓n/(\sqrt{D} + U↓n)\bslash $, and use Eq.\ (5) to prove
that $(\sqrt{D} + U↓n)/V↓n$ is positive when $n$ is large.]
\hang\textindent{c)}Letting $p↓n = Q↓{n+1}(A↓0, A↓1, \ldotss , A↓n)$
and $q↓n=Q↓n(A↓1, \ldotss , A↓n)$, prove the identity $Vp↓n↑2 + 2Up↓nq↓n
+\biglp(U↑2 - D)/V\bigrp q↓n↑2 = (-1)↑{n+1}V↓{n+1}$.
\hang\textindent{d)}Prove that the regular continued-fraction representation
of an irrational number $X$ is eventually periodic if and only
if $X$ is a quadratic irrationality.\xskip (This is the continued
fraction analog of the fact that the decimal expansion of a
real number $X$ is eventually periodic if and only if $X$ is rational.)
\exno 13. [M40] (J. Lagrange, 1797.)\xskip Let $f(x)=a↓nx↑n+\cdots+a↓0$, $a↓n>0$,
be a polynomial with integer coefficients, having no rational roots, and having
exactly one real root $\xi>1$. Design a computer program to find the first
thousand or so partial quotients of $\xi$, using the following algorithm
(which essentially involves only multiprecision addition):
\def\\#1. {\yskip\noindent\hbox to 38pt{\hfill\bf#1. }\hangindent38pt}
\\L1. Set $A ← 1$.
\\L2. For $k = 0$, 1, $\ldotss$, $n - 1$ (in this order),
and for $j = n - 1$, $\ldotss$, $k$ (in this order) set $a↓j ← a↓{j+1}
+ a↓j$.\xskip (This step replaces $f(x)$ by $g(x) = f(x + 1)$, a polynomial
whose roots are one less than those of $f$.)
\\L3. If $a↓n + a↓{n-1} + \cdots + a↓0 < 0$, set $A
← A + 1$ and return to L2.
\\L4. Output $A$ (which is the value of the next partial
quotient). Replace the coefficients $(a↓n, a↓{n-1}, \ldotss , a↓0)$ by $(-a↓0,
-a↓1, \ldotss , -a↓n)$ and return to L1.\xskip (This step replaces
$f(x)$ by a polynomial whose roots are reciprocals of those
of $f$.)
\yskip For example, starting with $f(x)
= x↑3 - 2$, the algorithm will output ``1'' $\biglp$changing $f(x)$ to
$x↑3 - 3x↑2 - 3x - 1\bigrp$; then ``3'' $\biglp$changing $f(x)$ to $10x↑3 -
6x↑2 - 6x - 1\bigrp$; etc.
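In modern notation the procedure might be sketched as follows in Python (an
illustration under the exercise's hypotheses, not a complete solution of the
exercise):

    def lagrange_quotients(c, count):
        # c[j] = integer coefficient of x^j in f(x), so c[-1] = a_n > 0;
        # f has no rational roots and exactly one real root xi > 1.
        # Returns the first `count` partial quotients of xi, using only
        # (multiprecision) integer additions, as in steps L1-L4.
        c = list(c)
        n = len(c) - 1
        out = []
        for _ in range(count):
            A = 1                                   # L1
            while True:
                for k in range(n):                  # L2: f(x) <- f(x+1)
                    for j in range(n - 1, k - 1, -1):
                        c[j] += c[j + 1]
                if sum(c) >= 0:                     # L3: root now in (0, 1)?
                    break
                A += 1
            out.append(A)                           # L4: output A; then turn
            c = [-a for a in reversed(c)]           #   roots into reciprocals
        return out

    print(lagrange_quotients([-2, 0, 0, 1], 6))     # cube root of 2: [1, 3, 1, 5, 1, 1]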
\exno 14. [M22] (A. Hurwitz, 1891.)\xskip Show that
the following rules make it possible to find the regular continued
fraction expansion for $2X$, given the partial quotients of
$X$:
$$\baselineskip14pt\eqalign{2\bigbslash\, 2a, b, c,\ldotss\bigbslash⊗=\bigbslash\,a,
2b + 2\bslash c,\ldotss\bslash \bigbslash ;\cr
2\bigbslash\,2a + 1,b,c,\ldotss\bigbslash⊗=\bigbslash\,a,1,1+2\bslash b-1,c,\ldotss
\bslash\bigbslash.\cr}$$
Use this idea to find the regular continued fraction
expansion of ${1\over 2}e$, given the expansion of $e$ in (13).
\trexno 15. [M31] (R. W. Gosper.)\xskip Generalizing exercise 14, design
an algorithm that computes the continued fraction $X↓0 + \bslash
X↓1, X↓2,\ldotss\bslash$ for $(ax + b)/(cx + d)$, given the
continued fraction $x↓0 + \bslash x↓1, x↓2,\ldotss\bslash$
for $x$, and given integers $a$, $b$, $c$, $d$ with $ad ≠ bc$. Make
your algorithm an ``on-line coroutine'' that outputs as many
$X↓k$ as possible before inputting each $x↓j$. Demonstrate how
your algorithm computes $(97x + 39)/(-62x - 25)$ when $x = -1
+ \bslash\hskip1pt 5, 1, 1, 1, 2, 1, 2\bslash $.
\exno 16. [HM30] (L. Euler, 1731.)\xskip Let $f↓0(z) = (e↑z - e↑{-z})/(e↑z
+ e↑{-z}) =\hbox{tanh}\,z$, and let $f↓{n+1}(z) = 1/f↓n(z) - (2n
+ 1)/z$. Prove that, for all $n$, $f↓n(z)$ is an analytic function
of the complex variable $z$ in a neighborhood of the origin,
and it satisfies the differential equation $f↓{\!n}↑\prime(z)
= 1 - f↓n(z)↑2 - 2nf↓n(z)/z$. Use this fact to prove that
$$\hbox{tanh}\,z = \bslash z↑{-1}, 3z↑{-1}, 5z↑{-1}, 7z↑{-1}, \ldotss
\bslash .$$
Then apply Hurwitz's rule (exercise 14) to prove
that
$$e↑{-1/n} =\bigbslash\,\overline{1, (2m + 1)n - 1, 1}\bigbslash ,\qquad
m ≥ 0.$$
$\biglp$This notation denotes the infinite continued fraction $\bslash\,1$,
$n-1$, 1, 1, $3n - 1$, 1, 1, $5n - 1$, 1, $\ldotss\bslash$.$\bigrp$
Also find the regular continued fraction expansion for $e↑{-2/n}$
when $n > 0$ is odd.
\trexno 17. [M23] (a) Prove that $\bslash x↓1, -x↓2\bslash =
\bslash x↓1 - 1, 1, x↓2 - 1\bslash $.\xskip (b) Generalize this
identity, obtaining a formula for $\bslash x↓1, -x↓2, x↓3,
-x↓4, \ldotss , x↓{2n-1}, -x↓{2n}\bslash$ in which all partial
quotients are positive integers when the $x$'s are large positive
integers.\xskip (c) The result of exercise 16 implies that $\tan 1 =
\bslash 1, -3, 5, -7,\ldotss\bslash$. Find the regular continued
fraction expansion of $\tan 1$.
%folio 473 galley 3 Bad spots. (C) Addison-Wesley 1978 *
\def\bslash{\char'477 } % boldface slash (vol. 2 only)
\exno 18. [M40] Develop
a computer program to find as many partial quotients of $x$ as possible, when $x$
is a real number given with high precision. Use this program
to calculate the first one thousand (or so) partial quotients
of Euler's constant $\gamma $, based on D. W. Sweeney's 3566-place
value [{\sl Math.\ Comp.\ \bf 17} (1963), 170--178].\xskip $\biglp$According to the
theory in the text,
we expect to get about 0.97 partial quotients per decimal digit.
Cf.\ Algorithm 4.5.2L and the article by J. W. Wrench, Jr.\ and
D. Shanks, {\sl Math.\ Comp.\ \bf 20} (1966), 444--447.$\bigrp$
\exno 19. [M20] Prove that $F(x) = \log↓b(1 + x)$ satisfies
Eq.\ (24).
\exno 20. [HM20] Derive (36) from (35).
\exno 21. [HM29] (E. Wirsing.)\xskip The bounds (37) were obtained
for a function $\varphi$ corresponding to $g$ with $Tg(x) =
1/(x + 1)$. Show that the function corresponding to $Tg(x)
= 1/(x + c)$ yields better bounds, when $c > 0$ is an appropriate
constant.
\exno 22. [HM46] (K. I. Babenko.)\xskip Develop efficient means to calculate
accurate approximations to the quantities
$λ↓j$ and $\Psi↓j(x)$ in (42), for small $j≥3$ and for $0≤x≤1$.
\exno 23. [HM23] Prove (51), using results from the proof of
Theorem W.
\exno 24. [M22] What is the average value of a partial quotient
$A↓n$ in the regular continued fraction expansion of a random
real number?
\exno 25. [HM25] Find an example of a set $\Iscr = I↓1 ∪
I↓2 ∪ I↓3 ∪ \cdots \subset [0, 1]$, where the $I$'s are disjoint
intervals, for which (43) does not hold.
\exno 26. [M23] Show that if the numbers $\{1/n, 2/n, \ldotss,
\lfloor n/2\rfloor /n\}$ are expressed as regular continued fractions,
the result is symmetric between left and right, in the sense
that $\bslash A↓t, \ldotss , A↓2, A↓1\bslash$ appears whenever
$\bslash A↓1, A↓2, \ldotss , A↓t\bslash$ does.
\exno 27. [M21] Derive (53) from (47) and (52).
\exno 28. [M23] Prove the following identities involving the
three number-theoretic functions $\varphi (n)$, $\mu (n)$, $\Lambdait
(n)$:
$$\vbox{\halign{\hbox to size{\qquad$\dispstyle#$\hfill}\cr
\hbox{a) }\sum ↓{d\rslash n} \mu (d) = \delta ↓{n1}.\hfill
\hbox{b) }\ln n = \sum ↓{d\rslash n}\Lambdait(d),\qquad n =
\sum ↓{d\rslash n} \varphi (d).\cr
\hbox{c) }\Lambdait(n) = \sum ↓{d\rslash n} \mu
\left(n\over d\right)\ln d,\qquad \varphi (n) = \sum ↓{d\rslash
n} \mu \left(n\over d\right)\,d.\cr}}$$
\exno 29. [M23] Assuming that $T↓n$
is given by (53), show that (55) = (56).
\trexno 30. [HM32] The following modification of Euclid's algorithm
is often suggested: Instead of replacing $v$ by $u \mod v$
during the division step, replace it by $|(u\mod v) - v|$
if $u \mod v > {1\over 2}v$. Thus, for example, if $u = 26$
and $v = 7$, we have $\gcd (26, 7) = \gcd (-2, 7) = \gcd (7, 2)$;
$-2$ is the {\sl remainder of smallest magnitude} when multiples of
7 are subtracted from 26. Compare this procedure with Euclid's
algorithm; estimate the number of division steps this method
saves, on the average.
\trexno 31. [M35] Find the ``worst case'' of the modification of Euclid's algorithm
suggested in exercise 30; what are the smallest inputs $u>v>0$ that require $n$
division steps?
\exno 32. [20] (a) A Morse code sequence of length
$n$ is a string of $r$ dots and $s$ dashes, where $r + 2s =
n$. For example, the Morse code sequences of length 4 are
$$\def\\{\vcenter{\hbox{---}}}
{\cdot}\,{\cdot}\,{\cdot}\,{\cdot},\quad {\cdot}\,{\cdot}\,\\,\quad
{\cdot}\,\\\,{\cdot},\quad\\\,{\cdot}\,{\cdot},\quad\\\,\\.$$
Noting that the continuant $Q↓4(x↓1,x↓2,x↓3,x↓4)=x↓1x↓2x↓3x↓4
+x↓1x↓2+x↓1x↓4+x↓3x↓4+1$, find and prove a simple relation between
$Q↓n(x↓1, \ldotss , x↓n)$ and Morse code sequences of length\penalty 1000\
$n$.\xskip (b) (L. Euler, {\sl Novi Comm.\ Acad.\ Sci.\ Pet.\ \bf 9}
(1762), 53--69.)\xskip Prove that $$\twoline{Q↓{m+n}(x↓1, \ldotss , x↓{m+n})
= Q↓m(x↓1, \ldotss , x↓m)Q↓n(x↓{m+1}, \ldotss , x↓{m+n})}{2pt}{\null + Q↓{m-1}(x↓1,
\ldotss , x↓{m-1})Q↓{n-1}(x↓{m+2}, \ldotss , x↓{m+n}).}$$
\exno 33. [M32] Let $h(n)$ be the number of representations
of $n$ in the form
$$\hbox to size{$n = xx↑\prime + yy↑\prime,\hfill x > y > 0,\hfill
x↑\prime > y↑\prime> 0,\hfill\gcd(x, y) = 1,$\hfill integer $x,x↑\prime,
y,y↑\prime.$}$$
(a) Show that if the conditions are relaxed to
allow $x↑\prime = y↑\prime $, the number of representations
is $h(n) + \lfloor (n - 1)/2\rfloor $.\xskip (b) Show that for fixed
$y > 0$ and $0 < t ≤ y$, where $\gcd(t, y) = 1$, and for each
fixed $x↑\prime$ such that $x↑\prime t ≡ n\modulo y$ and
$0 < x↑\prime < n/(y + t)$, there is exactly one representation
of $n$ satisfying the restrictions of (a) and the condition
$x ≡ t \modulo y$.\xskip (c) Consequently
$$h(n) = \sum \left\lceil\left({n\over y + t} - t↑\prime\right)\,{1\over
y}\right\rceil - \lfloor (n - 1)/2\rfloor,$$
where the sum is over all positive integers $y$,
$t$, $t↑\prime$ such that $\gcd(t, y) = 1$, $t ≤ y$, $t↑\prime ≤ y$,
$tt↑\prime ≡ n \modulo y$.\xskip (d) Show that each of the $h(n)$
representations can be expressed uniquely in the form
$$\baselineskip14pt
\eqalign{x⊗= Q↓m(x↓1, \ldotss , x↓m),\cr
x↑\prime⊗= Q↓k(x↓{m+1}, \ldotss , x↓{m+k})\,d,\cr}\qquad
\eqalign{y⊗= Q↓{m-1}(x↓1, \ldotss, x↓{m-1}),\cr
y↑\prime⊗= Q↓{k-1}(x↓{m+2}, \ldotss, x↓{m+k})\,d,\cr}$$
where $m$, $k$, $d$, and the $x↓j$ are positive integers
with $x↓1 ≥ 2$, $x↓{m+k} ≥ 2$, and $d$ is a divisor of $n$. The
identity of exercise 32 now implies that $n/d = Q↓{m+k}(x↓1,
\ldotss , x↓{m+k})$. Conversely, every sequence of positive integers
$x↓1$, $\ldotss$, $x↓{m+k}$ with $x↓1 ≥ 2$, $x↓{m+k} ≥ 2$, and $Q↓{m+k}(x↓1,
\ldotss , x↓{m+k})$ dividing $n$, corresponds in this way to
$m + k - 1$ representations of $n$.\xskip (e) Therefore $nT↓n = \lfloor
(5n - 3)/2\rfloor + 2h(n)$.
\exno 34. [HM40] (H. Heilbronn.)\xskip (a) Let $h↓d(n)$ be the number
of representations of $n$ as in exercise 33 such that $xd <
x↑\prime $, plus half the number of representations with $xd
= x↑\prime $. Let $g(n)$ be the number of representations without
the requirement $\gcd(x, y) = 1$. Prove that
$$h(n) = \sum ↓{d\rslash n} \mu (d)g\left(n\over d\right),\qquad
g(n) = 2 \sum ↓{d\rslash n} h↓d\left(n\over d\right).$$
(b) Generalizing exercise 33(b), show that for $d
≥ 1$, $h↓d(n) = \sum \biglp n/\biglp y(y + t)\bigrp\bigrp+ O(n)$, where the sum
is over all integers $y$ and $t$ such that $\gcd(t, y) = 1$ and $0<t
≤ y < \sqrt{n/d}$.\xskip
(c) Show that $\sum\biglp y/(y+t)\bigrp= \varphi (y)
\ln 2 + O\biglp\sigma ↓{-1}(y)\bigrp$, where the sum is over the range
$0 < t ≤ y$, $\gcd(t, y) = 1$;
and where $\sigma ↓{-1}(y) = \sum ↓{d\rslash y}(1/d)$.\xskip
(d) Show that $\sum↓{1≤y≤n}\varphi (y)/y↑2 = \sum
↓{1≤d≤n}\mu(d)H↓{\lfloor n/d\rfloor }/d↑2$.\xskip
(e) Hence $T↓n = \biglp(12 \ln 2)/π↑2\bigrp\biglp\ln n - \sum ↓{d\rslash n}
\Lambdait(d)/d\bigrp+O\biglp\sigma↓{-1}(n)↑2\bigrp$.
\exno 35. [HM41] (A. C. Yao and D. E. Knuth.)\xskip Prove that the sum of all
partial quotients for the fractions $m/n$, for $1≤m<n$, is equal to $2\biglp
\sum\lfloor x/y\rfloor+\lfloor n/2\rfloor\bigrp$, where the sum is over all
representations $n=xx↑\prime+yy↑\prime$ satisfying the conditions of exercise 33(a).
Show that $\sum\lfloor x/y\rfloor=3π↑{-2}n(\ln n)↑2+O\biglp n\log n\,(\log\log n)↑2
\bigrp$, and apply this to the ``ancient'' form of Euclid's algorithm that uses
only subtraction instead of division.
\exno 36. [M35] (G. H. Bradley.)\xskip
What is the smallest value of $u↓n$ such that the calculation of $\gcd(u↓1,\ldotss,
u↓n)$ by steps C1 and C2 in Section 4.5.2 requires $N$
divisions, if Euclid's algorithm is used throughout? Assume that
$N≥n$.
\exno 37. [M38] (T. S. Motzkin
and E. G. Straus.)\xskip Let $a↓1$, $\ldotss$, $a↓n$ be positive
integers. Show that $\max Q↓n(a↓{p(1)}, \ldotss , a↓{p(n)})$,
over all permutations $p(1) \ldotsm p(n)$ of $\{1, 2, \ldotss,n\}$,
occurs when $a↓{p(1)} ≥ a↓{p(n)} ≥ a↓{p(2)} ≥ a↓{p(n-1)} ≥ \cdotss
$; and the minimum occurs when $a↓{p(1)} ≤ a↓{p(n)} ≤ a↓{p(3)}
≤ a↓{p(n-2)} ≤ a↓{p(5)} ≤ \cdots ≤ a↓{p(6)} ≤ a↓{p(n-3)} ≤ a↓{p(4)}
≤ a↓{p(n-1)} ≤ a↓{p(2)}$.
\exno 38. [M25] (J. Mikusinski.)\xskip Let $K(n) = \max↓{m≥0}
T(m, n)$. Theorem F shows that $K(n) ≤ \lfloor\log↓\phi(\sqrt5\,n+1)\rfloor-2$;
prove that $K(n) ≥ {1\over 2}\lceil\log↓\phi
(\sqrt{5}\,n + 1)\rceil - 2$.
\trexno 39. [M25] (R. W. Gosper.)\xskip If a baseball player's batting
average is .334, what is the fewest possible number of times
he has been at bat?\xskip [Note for non-baseball-fans: Batting average = (number
of hits)/(times at bat), rounded to three decimal places.]
\trexno 40. [M28] ({\sl The Peirce tree.})\xskip Consider an infinite binary tree
in which each node is labeled with the fraction $(p↓l+p↓r)/(q↓l+q↓r)$, where
$p↓l/q↓l$ is the label of the node's nearest left ancestor and $p↓r/q↓r$ is the
label of the node's nearest right ancestor.\xskip (A left ancestor is one that
precedes a node in symmetric order, while a right ancestor follows the node. See
Section 2.3.1 for the definition of symmetric order.)\xskip If the node has no
left ancestors, $p↓l/q↓l=0/1$; if it has no right ancestors, $p↓r/q↓r=1/0$.
Thus the label of the root is $1/1$; the labels of its two sons are 1/2 and 2/1;
the labels of the four nodes on level 2 are 1/3, 2/3, 3/2, and 3/1, from
left to right; the labels of the eight nodes on level 3 are 1/4, 2/5, 3/5, 3/4,
4/3, 5/3, 5/2, 4/1; and so on.
Prove that $p$ is relatively prime to $q$ in each label $p/q$; furthermore,
the node labeled $p/q$ precedes the node labeled $p↑\prime/q↑\prime$ in
symmetric order if and only if $p/q<p↑\prime/q↑\prime$. Find a connection
between the continued fraction for the label of a node and the path to that node,
thereby showing that each positive rational number appears as the label
of exactly one node in the tree.
\exno 41. [M25] Show that the function round$(x)$ needed in fixed-slash or
floating-slash arithmetic (exercise 4.5.1--12) can be computed rather
easily from the continued-fraction representation of $x$.
\exno 42. [HM48] Develop the analysis of algorithms for computing the gcd of
three or more integers.
%folio 476 galley 4 Mostly unreadable tape. (C) Addison-Wesley 1978 *
\runningrighthead{FACTORING INTO PRIMES}
\section{4.5.4}
\sectionskip
\sectionbegin{4.5.4. Factoring into Primes}
Several of the computational methods we have
encountered in this book rest on the fact that every positive
integer $n$ can be expressed in a unique way in the form
$$n = p↓1p↓2 \ldotsm p↓t,\qquad p↓1 ≤ p↓2 ≤ \cdots ≤ p↓t,\eqno(1)$$
where each $p↓k$ is prime.\xskip (When $n=1$, this equation holds for
$t=0$.)\xskip It is unfortunately not a simple matter to find this
prime factorization of $n$, or to determine whether or not $n$ is prime. So
far as anyone knows, it is a great deal harder to factor a large number $n$
than to compute the greatest common divisor of two large numbers $m$ and $n$;
therefore we should avoid factoring large numbers whenever possible. But several
ingenious ways to speed up the factoring process have been discovered, and we will
now investigate some of them.
\subsectionbegin{Divide and factor} First let us consider the most obvious
algorithm for factor\-ization: If $n>1$, we can divide $n$ by successive primes
$p=2$, 3, 5, $\ldots$ until discovering the smallest $p$ for which $n\mod p=0$.
Then $p$ is the smallest prime factor of $n$, and the same process may be applied
to $n←n/p$ in an attempt to divide this new value of $n$ by $p$ and by higher
primes. If at any stage we find that $n\mod p≠0$ but $\lfloor n/p\rfloor≤p$,
we can conclude that $n$ is prime; for if $n$ is not prime then by
(1) we must have $n ≥ p↓1↑2$, but $p↓1 > p$ implies that
$p↓1↑2 ≥ (p + 1)↑2 > p(p + 1) > p↑2 + (n \mod p)≥\lfloor
n/p\rfloor p + (n \mod p) = n$. This leads us to the following
procedure:
\algbegin Algorithm A (Factoring by division).
Given a positive integer $N$, this algorithm finds the prime
factors $p↓1 ≤ p↓2 ≤ \cdots ≤ p↓t$ of $N$ as in Eq.\ (1). The
method makes use of an auxiliary sequence of ``trial divisors''
$$2 = d↓0 < d↓1 < d↓2 < d↓3 < \cdotss,\eqno (2)$$
which includes all prime numbers $≤ \sqrt N$ (and
which may also include values that are {\sl not} prime, if
it is convenient to do so). The sequence of $d$'s must also
include at least one value such that $d↓k > \sqrt{N}$.
\algstep A1. [Initialize.] Set $t←0$, $k←0$, $n←N$.\xskip (During this algorithm
the variables $t$, $k$, $n$ are related by the following condition: ``$n=
N/p↓1\ldotsm p↓t$, and $n$ has no prime factors less than $d↓k$.'')
\topinsert{\vskip47mm
\ctrline{\caption Fig.\ 10. A simple factoring algorithm.}}
\algstep A2. [$n=1$?] If $n=1$, the algorithm terminates.
\algstep A3. [Divide.] Set $q←\lfloor n/d↓k\rfloor$, $r←n\mod d↓k$.\xskip (Here
$q$ and $r$ are the quotient and remainder obtained when $n$ is divided by $d↓k$.)
\algstep A4. [Zero remainder?] If $r≠0$, go to step A6.
\algstep A5. [Factor found.] Increase $t$ by 1, and set $p↓t←d↓k$, $n←q$. Return
to step A2.
\algstep A6. [Low quotient?] If $q>d↓k$, increase $k$ by 1 and return to step A3.
\algstep A7. [$n$ is prime.] Increase $t$ by 1, set $p↓t←n$, and terminate the
algorithm.\quad\blackslug
\yyskip As an example of Algorithm A\null, consider the factorization of the
number $N = 25852$. We
immediately find that $N = 2 \cdot 12926$; hence $p↓1 = 2$.
Furthermore, $12926 = 2 \cdot 6463$, so $p↓2 = 2$. But now $n
= 6463$ is not divisible by 2, 3, 5, $\ldotss$, 19; we find that
$n = 23 \cdot 281$, hence $p↓3 = 23$. Finally $281 = 12 \cdot
23 + 5$ and $12 ≤ 23$; hence $p↓4 = 281$. The determination of
25852's factors has therefore involved a total of 12 division operations; on the
other hand, if we had tried to factor the slightly smaller number
25849 (which is prime), at least 38 division operations would have been
performed. This illustrates the fact that Algorithm A requires
a running time roughly proportional to $\max(p↓{t-1},\sqrt{p↓t}\,)$.\xskip
(If $t=1$, this
formula is valid if we adopt the convention $p↓0=1$.)
The sequence $d↓0$, $d↓1$, $d↓2$, $\ldots$ of
trial divisors used in Algorithm A can be taken to
be simply 2, 3, 5, 7, 11, 13, 17, 19,
23, 25, 29, 31, 35, $\ldotss$, where we alternately add 2 and
4 after the first three terms. This sequence contains all numbers
that are not multiples of 2 or 3; it also includes numbers
such as 25, 35, 49, etc., which are not prime, but the algorithm
will still give the correct answer. A further savings of 20
percent in computation time can be made by removing the numbers
$30m \pm 5$ from the list for $m ≥ 1$, thereby eliminating all
of the spurious multiples of 5. The exclusion of multiples of
7 shortens the list by 14 percent more, etc. A compact bit table
can be used to govern the choice of trial divisors.
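In Python, Algorithm A with this divisor sequence might be sketched as
follows (the function names are ours):

    def trial_divisors():
        # 2, 3, 5, 7, 11, 13, 17, 19, 23, 25, ...: after 2 and 3 we alternately
        # add 2 and 4, producing every number prime to 6 (not all are prime,
        # but the algorithm still gives correct answers).
        yield 2
        yield 3
        d, step = 5, 2
        while True:
            yield d
            d += step
            step = 6 - step

    def factor_by_division(N):
        # Returns the prime factors p1 <= p2 <= ... <= pt of N, as in (1).
        factors, n = [], N                  # step A1
        for d in trial_divisors():
            if n == 1:                      # step A2
                break
            while n % d == 0:               # steps A3-A5: d divides n
                factors.append(d)
                n //= d
            if n > 1 and n // d <= d:       # step A6 fails, so by step A7
                factors.append(n)           #   the remaining n is prime
                break
        return factors

    print(factor_by_division(25852))        # [2, 2, 23, 281], as computed above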
If $N$ is known to be small, it is reasonable to have a table of all the
necessary primes as part of the program. For example, if $N$ is less than a
million, we need only include the 168 primes less than a thousand (followed
by the value $d↓{168}=1000$, to terminate the list in case $n$ is a prime larger
than $997↑2$). Such a table can be set up by means of a short auxiliary program,
which builds the table just after the factoring program has been loaded into the
computer; see Algorithm 1.3.2P\null,
or see exercise 8.
How many trial divisions are necessary
in Algorithm A\null? Let $π(x)$ be the number of primes $≤ x$, so
that $π(2) = 1$, $π(10) = 4$; the asymptotic behavior of this
function has been studied extensively by many of the world's
greatest mathematicians, beginning with Legendre in 1798. After numerous
advances made during the next hundred years, Charles
de la Vall\'ee Poussin proved in 1899 that, for some $A
> 0$,
$$π(x) = \int ↑{x}↓{2} {dt\over\ln t} + O\biglp
xe↑{-A\sqrt{\,\log x}\,}\bigrp.\eqno(3)$$
[{\sl M\'em.\ Couronn\'es Acad.\ Roy.\ Belgique
\bf 59} (1899), 1--74.]\xskip Integrating by parts yields
$$π(x)={x\over\ln x}+{x\over(\ln x)↑2}+{2!\,x\over(\ln x)↑3}+\cdots
+{r!\,x\over(\ln x)↑{r+1}} + O\left(x\over(\log x)↑{r+2}\right)\eqno(4)$$
for all fixed $r ≥ 0$. The error term in (3) can
be improved, for example to $O\biglp x\exp\biglp-A(\log x)↑{3/5}/(\log
\log x)↑{1/5}\bigrp\bigrp$; see A. Walfisz, {\sl Weyl'sche Exponential\-summen
in der neueren Zahlentheorie} (Berlin, 1963), Chapter 5. Bernhard Riemann
conjectured in 1859 that
$$\quad π(x) = \sum ↓{k≥1}\mu(k)L\biglp\spose{\raise5pt\hbox
{\hskip2.5pt$\scriptscriptstyle k$}}\sqrt x\,\bigrp/k + O(1) = L(x) -
{1\over 2}L\biglp\sqrt{x}\,\bigrp- {1\over 3}L\biglp\spose{\raise5pt\hbox
{\hskip2.5pt$\scriptscriptstyle 3$}}\sqrt{x}\,\bigrp + \cdots\eqno (5)$$
where $L(x)=\int↓2↑x\,dt/\!\ln t$, and
his formula agrees well with actual counts when $x$ is of reasonable size.
For example, we have the following table:
$$\vbox{\halign{$\hfill#$⊗\quad$\hfill#$⊗\quad$\hfill#$⊗\quad$\hfill#$⊗\quad
\hfill#\cr
x⊗π(x)\quad⊗x/\!\ln x\quad⊗L(x)\quad⊗Riemann's formula\cr
\noalign{\vskip 2pt}
10↑3⊗168⊗144.8⊗176.6⊗168.36\cr
10↑6⊗78498⊗72382.4⊗78626.5⊗78527.40\cr
10↑9⊗50847534⊗48254942.4⊗50849233.9⊗50847455.43\cr}}$$
Actually Riemann's conjecture (5) was disproved by J. E.
Littlewood in 1914; see Hardy and Littlewood, {\sl Acta Math.\ \bf 41}
(1918), 119--196, where it is shown that there is a positive
constant $C$ such that $π(x) > L(x) + C\sqrt{x}\log\log\log
x/\!\log x$ for infinitely many $x$. But Riemann made
another much more plausible conjecture, the famous ``Riemann
hypothesis'' about the complex zeros of the zeta function; this
hypothesis, if true, would imply that $π(x) = L(x) + O\biglp\sqrt{x}\log x\bigrp$.
In order to analyze the average behavior of Algorithm A\null, we would like to
know how large the largest prime factor $p↓t$ will tend to be. This question
was first investigated by Karl Dickman [{\sl Arkiv f\"or Mat., Astron.,
och Fys.\ \bf 22A}, 10 (1930), 1--14], who studied the probability
that a random integer between 1 and $x$ will have its largest
prime factor $≤ x↑α$. Dickman gave a heuristic argument to
show that this probability approaches the limiting value $F(α)$
as $x → ∞$, where $F$ can be calculated from the functional
equation
$$F(α) = \int ↑{α}↓{0}F\left(t\over 1 - t\right)\,{dt\over t},\quad\hbox{for }
0 ≤ α ≤ 1;\qquad F(α) = 1\quad\hbox{for }α ≥ 1.\eqno (6)$$
His argument was essentially this: The number of
integers $≤x$ whose largest prime factor is between $x↑t$ and
$x↑{t+dt}$ is $xF↑\prime(t)\,dt$. The number of primes $p$ in that
range is $π(x↑{t+dt}) - π(x↑t) = π\biglp x↑t + (\ln x)x↑t\,dt\bigrp -
π(x↑t) = x↑t\,dt/t$. For every such $p$, the number of integers $n$ such that
``$np≤x$ and the largest prime factor of $n$ is $≤p$'' is
the number of $n≤x↑{1-t}$ whose largest prime factor is $≤(x↑{1-t})↑{t/(1-t)}$,
namely $x↑{1-t}F\biglp t/(1-t)\bigrp$. Hence $xF↑\prime(t)\,dt=(x↑t\,dt/t)\biglp
x↑{1-t}F\biglp t/(1-t)\bigrp\bigrp$, and (6) follows by integration. This
heuristic argument can be made rigorous; V. Ramaswami [{\sl Bull.\ Amer.\
Math.\ Soc.\ \bf 55} (1949), 1122--1127] showed that the probability in
question for fixed $α$ is $F(α)+ O(1/\!\log x)$, as $x → ∞$,
and many other authors have extended the analysis [see the survey
by Karl K. Norton, {\sl Memoirs Amer.\ Math.\ Soc.\ \bf 106}
(1971), 9--27].
If ${1\over 2} ≤ α ≤ 1$, formula (6) simplifies to
$$F(α)\, =\,1 - \int ↑1↓αF\left(t\over1-t\right)\,{dt\over t}
\,=\,1-\int↓α↑1{dt\over t}\,=\,1+\ln α.$$
Thus, for example, the probability that a random positive integer $≤x$ has a prime
factor $>\sqrt x$ is $1-F({1\over2})=\ln2$, about 69 percent. In all such cases,
Algorithm A must work hard.
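The $\ln 2$ phenomenon is easy to observe empirically, as in this Python
sketch, which sieves out the largest prime factor of every integer up to a
million:

    from math import isqrt

    def largest_prime_factors(x):
        # lpf[m] = largest prime factor of m, for 2 <= m <= x.
        lpf = list(range(x + 1))
        for p in range(2, x + 1):
            if lpf[p] == p:                          # p is prime
                for multiple in range(2 * p, x + 1, p):
                    lpf[multiple] = p                # largest prime seen so far
        return lpf

    x = 10 ** 6
    lpf = largest_prime_factors(x)
    root = isqrt(x)
    big = sum(1 for m in range(2, x + 1) if lpf[m] > root)
    print(big / x)      # about 0.69, close to ln 2 = 0.693...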
%folio 479 galley 5 Tape mostly unreadable. (C) Addison-Wesley 1978 *
The net result of this discussion
is that Algorithm A will give the answer rather quickly if we
want to factor a six-digit number; but for large $N$ the amount
of computer time for factorization by trial division will rapidly
exceed practical limits, unless we are unusually lucky.
Later in this section we will see that there are fairly good
ways to determine whether or not a reasonably large number $n$
is prime, without trying all divisors up to $\sqrt{n}$. Therefore
Algorithm A would often run faster if we inserted a primality
test between steps A2 and A3; the running time for this improved
algorithm would then be roughly proportional to $p↓{t-1}$, the {\sl
second-largest} prime factor of $N$, instead of to $\max(p↓{t-1},
\sqrt{\chop to 0pt{p↓t}}\,)$. By an argument analogous to Dickman's (see exercise
18), we can show that the second-largest prime factor of a random
integer\penalty1000\ $≤x$ will be $≤x↑β$ with approximate probability $G(β)$,
where
$$G(β) = \int ↑{β}↓{0}\left(G\left(t\over 1-t\right) - F\left(t\over1-t\right)
\right)\,{dt\over t},\quad\hbox{for }0 ≤ β≤\textstyle {1\over 2}.\eqno(7)$$
Clearly $G(β) = 1$ for $β ≥ {1\over 2}$.\xskip
(See Fig.\ 11.)\xskip Numerical evaluation of (6) and
(7) yields the following ``percentage points'':
$$\vbox{\:b\baselineskip14pt\halign to size{\hfill$#=$\tabskip0pt plus 10pt
⊗\hfill#\hfill⊗\hfill#\hfill⊗\hfill#\hfill⊗\hfill#\hfill⊗\hfill#\hfill⊗\hfill#\hfill
⊗\hfill#\hfill⊗\hfill#\hfill⊗\hfill#\hfill⊗\hfill#\hfill⊗\hfill#\hfill\tabskip0pt\cr
F(α), G(β)⊗.01⊗.05⊗.10⊗.20⊗.35⊗.50⊗.65⊗.80⊗.90⊗.95⊗.99\cr
α⊗.2697⊗.3348⊗.3785⊗.4430⊗.5220⊗.6065⊗.7047⊗.8187⊗.9048⊗.9512⊗.9900\cr
β⊗.0056⊗.0273⊗.0531⊗.1003⊗.1611⊗.2117⊗.2582⊗.3104⊗.3590⊗.3967⊗.4517\cr}}$$
Thus, the second-largest prime factor will be $≤x↑{.2117}$ about half the
time, etc.
\topinsert{\vskip 51mm
\baselineskip11pt\ctrline{\hbox to 205pt{\caption Fig.\ 11.
Probability distribution functions for the two largest prime factors of a
random integer $≤x$.}}}
The {\sl total number of prime factors}, $t$, has also been intensively
analyzed. Obviously $1≤t≤\lg N$, but these lower and upper bounds are
seldom achieved. It is possible to prove that a random integer between 1 and $x$
will have $t≤\ln\ln x+c\sqrt{\,\ln\ln x}$ with the limiting probability
$${1\over\sqrt{2π}}\int↓{-∞}↑c e↑{-u↑2/2}\,d↓{\null}u\eqno(8)$$
as $x→∞$, for any fixed $c$. In other words, the distribution of $t$ is
essentially normal, with mean and variance $\ln\ln x$; about 99.73 percent of
all large integers $≤x$ have $|t-\ln\ln x|≤3\sqrt{\,\ln\ln x}$. Furthermore
the average value of $t-\ln\ln x$ is known to be $$\gamma+\sum↓{p\,\,\hbox{\:d
prime}\,}
\biglp\ln(1-1/p)+1/(p-1)\bigrp=1.03465\ 38818\ 97438.$$[Cf.\ G. H. Hardy
and E. M. Wright, {\sl Introduction to the Theory of Numbers}, 4th ed.\
(Oxford, 1960), $\section$22.11; see also P. Erd\H os and M. Kac, {\sl Amer.\ J.
Math.\ \bf62} (1940), 738--742.]
The size of prime factors has a remarkable connection with permutations: The
average number of bits in the $k$th largest prime factor of a random $n$-bit
integer is asymptotically the same as the average length of the $k$th largest
cycle of a random $n$-element permutation, as $n→∞$.\xskip [See D. E. Knuth and
L. Trabb Pardo, {\sl Theoretical Comp.\ Sci.\ \bf3} (1976), 321--348.]\xskip
It follows that Algorithm A usually finds
a few small factors and then begins a long-drawn-out search for the big
ones that are left.
\subsectionbegin{Factoring \`a la Monte Carlo} Near the beginning of
Chapter 3, we observed that ``a random-number generator chosen at random isn't
very random.'' This principle, which worked against us in that chapter, has the
redeeming virtue that it leads to a surprisingly efficient method of
factorization, discovered by J. M. Pollard [{\sl BIT \bf15} (1975), 331--334].
The number of computational steps in Pollard's method is on the order of
$\sqrt{\chop to 0pt{p↓{t-1}}}$,
so it is significantly faster than Algorithm A when $N$
is large. According to (7) and Fig.\ 11,
the running time will usually be well under $N↑{1/4}$.
Let $f(x)$ be any polynomial with integer coefficients, and consider the two
sequences defined by
$$x↓0=y↓0=A;\qquad x↓{m+1}=f(x↓m)\mod N,\qquad y↓{m+1}=f(y↓m)\mod p,\eqno(9)$$
where $p$ is any prime factor of $N$. It follows that
$$y↓m=x↓m\mod p,\qquad\hbox{for }m≥1.\eqno(10)$$
Now exercise 3.1--7 shows that we will have $y↓m=y↓{\,l(m)-1}$ for some
$m≥1$, where $l(m)$ is the largest power of 2 that is $≤m$. Thus $x↓m-x↓{\,l(m)-1}$
will be a multiple of $p$. Furthermore if $f(y)\mod p$ behaves as a random mapping
from the set $\{0,1,\ldotss,p-1\}$ into itself, exercise 3.1--12 shows that the
average value of the least such $m$ will be of order $\sqrt{\chop to 0pt{p}}$.
In fact, exercise 4 below shows that this
average value for random mappings is less than $1.625\,Q(p)$, where
the function $Q(p)\approx\sqrt{πp/2}$ was defined in Section 1.2.11.3.
If the different prime divisors of $N$ correspond to different
values of $m$ (as they almost surely will, when $N$ is large), we will be able
to find them by calculating $\gcd(x↓m-x↓{\,l(m)-1},N)$ for $m=1$, 2, 3, $\ldotss$,
until the unfactored residue is prime.
From the theory in Chapter 3, we know that a linear polynomial
$f(x) = ax + c$ will not be sufficiently random for our
purposes. The next-simplest case is quadratic, say $f(x) = x↑2
+ 1$; although we don't {\sl know} that this function is sufficiently
random, our lack of knowledge tends to support the hypothesis
of randomness, and empirical tests show that this $f$ does work
essentially as predicted. In fact, $f$ is probably slightly
{\sl better} than random, since $x↑2 + 1$ takes on only ${1\over2}(p+1)$
distinct values mod $p$. Therefore the following procedure is reasonable:
\algbegin Algorithm B (Monte Carlo factorization).
This algorithm outputs the prime factors of a given integer $N≥2$, with
high probability, although there is a chance that it will fail.
\algstep B1. [Initialize.] Set $x←2$, $x↑\prime←5$, $k←1$, $l←1$, $n←N$.\xskip
$\biglp$During this algorithm, $n$ is the unfactored part of $N$, and $(x,x↑\prime)$
represents $(x↓{\,l(m)-1}\mod n,\,x↓m\mod n)$ in (9), where $f(x)=x↑2+1$, $A=2$,
$l=l(m)$, and $k=2l-m$.$\bigrp$
\algstep B2. [Test primality.] If $n$ is prime (see the discussion below),
output $n$; the algorithm terminates.
\algstep B3. [Factor found?] Set $g←\gcd(x↑\prime-x,\,n)$. If $g=1$, go on to step
B4; otherwise output $g$. Now if $g=n$, the algorithm terminates (and it has failed,
because we know that $n$ isn't prime). Otherwise set $n←n/g$, $x←x\mod n$,
$x↑\prime←x↑\prime\mod n$, and return to step B2.\xskip (Note that $g$ may not be
prime; this should be tested. In the rare event that $g$ isn't prime, its
prime factors probably won't be determinable with this algorithm unless some
changes are made as discussed below.)
\algstep B4. [Advance.] Set $k←k-1$. If $k=0$, set $x←x↑\prime$, $l←2l$, $k←l$.
Set $x↑\prime←(x↑{\prime2}+1)\mod n$ and return to B3.\quad\blackslug
%folio 482 galley 6 Mostly hopeless. (C) Addison-Wesley 1978 *
\yyskip As an example of Algorithm B\null, let's try
to factor $N = 25852$ again. The third execution of step B3
will output $g = 4$ (which isn't prime). After six more iterations
the algorithm finds the factor $g = 23$.
Algorithm B has not distinguished itself in this example, but of course
it was designed to factor {\sl big} numbers. Algorithm A takes
much longer to find large prime factors, but it can't be beat
when it comes to removing the small ones. In practice, we should
run Algorithm A awhile before switching over to Algorithm B.
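Here is a Python sketch of Algorithm B that reproduces the example above; a
Miller--Rabin test (our choice, since the text leaves the primality test of
step B2 open) stands in for step B2:

    from math import gcd

    def is_probable_prime(n):
        # Stand-in for step B2; any reliable primality test will do.
        if n < 2:
            return False
        bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
        for p in bases:
            if n % p == 0:
                return n == p
        d, s = n - 1, 0
        while d % 2 == 0:
            d, s = d // 2, s + 1
        for a in bases:
            x = pow(a, d, n)
            if x in (1, n - 1):
                continue
            for _ in range(s - 1):
                x = x * x % n
                if x == n - 1:
                    break
            else:
                return False
        return True

    def monte_carlo_factors(N):
        # A sketch of Algorithm B; it can fail (g = n in step B3), as noted.
        x, xp, k, l, n = 2, 5, 1, 1, N           # step B1
        while True:
            if is_probable_prime(n):             # step B2
                yield n
                return
            g = gcd(xp - x, n)                   # step B3
            if g == n:
                return                           # failure
            if g > 1:
                yield g                          # g itself may not be prime
                n //= g
                x, xp = x % n, xp % n
                continue
            k -= 1                               # step B4
            if k == 0:
                x, l = xp, 2 * l
                k = l
            xp = (xp * xp + 1) % n

    print(list(monte_carlo_factors(25852)))      # [4, 23, 281], as above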
We can get a better idea of Algorithm B's prowess
by considering the ten largest six-digit primes. The number of iterations, $m(p)$,
that Algorithm B needs to find the factor $p$ is given in the following table:
$$\vbox{\baselineskip15pt\:b
\halign to size
{$\hfill#=$\tabskip0pt plus 10pt⊗\hfill#⊗\hfill#⊗\hfill#⊗\hfill#⊗\hfill#⊗\!
\hfill#⊗\hfill#⊗\hfill#⊗\hfill#⊗\hfill#\tabskip0pt\cr
p⊗999863⊗999883⊗999907⊗999917⊗999931⊗999953⊗999959⊗999961⊗999979⊗999983\cr
m(p)⊗276⊗409⊗2106⊗1561⊗1593⊗1091⊗474⊗1819⊗395⊗814\cr}}$$
Experiments indicate that $m(p)$ has an average value of about
$2\sqrt{\chop to 0pt{p}}$, and
it never exceeds $12\sqrt{\chop to 0pt{p}}$ when
$p<1000000$. The maximum $m(p)$ for $p<10↑6$ is
$m(874771)=7685$; and the maximum of $m(p)/\sqrt{\chop to 0pt{p}}$
occurs when $p=290047$,
$m(p)=6251$. According to these experimental results, almost all 12-digit numbers
can be factored in fewer than 2000 iterations of Algorithm B (compared to
roughly 100,000 divisions in Algorithm A).
The time-consuming operations in each
iteration of Algorithm B are the multiprecision multiplication
and division in step B4, and the gcd in step B3. If the gcd operation
is slow, Pollard suggests gaining speed by accumulating the
product mod $n$ of, say, ten consecutive $(x↑\prime - x)$ values
before taking each gcd; this replaces 90 percent of the gcd
operations by a single multiplication and division while only
slightly increasing the chance of failure. He also suggests
starting with $m=q$ instead of $m=1$ in step B1, where $q$
is, say, ${1\over 10}$ of the number of iterations you are planning
to use.
In those rare cases where failure occurs for large $N$, we could try using
$f(x)=x↑2+c$ for some $c≠0$ or 1. The value $c=-2$ should also be avoided,
since the recurrence $x↓{m+1}=x↓m↑2-2$ has solutions of the form $x↓m=r↑{2↑m}+r↑{-
2↑m}$. Other values of $c$ do not seem to lead to simple relationships mod
$p$, and they should all be satisfactory when used with suitable starting values.
\subsectionbegin{Fermat's method} Another approach to the factoring
problem, which was used by Pierre de Fermat in 1643, is more
suited to finding large factors than small ones.\xskip [Fermat's description
of his method, translated into English, appears in L. E. Dickson's
{\sl History of the Theory of Numbers} {\bf 1} (New York: Chelsea,
1952), 357.]
Assume that $N = uv$, where $u
≤ v$. For practical purposes we may assume that $N$ is odd; this means that
$u$ and $v$ are odd. We can therefore let
$$\baselineskip15pt
\rpile{x=(u+v)/2,\cr N=x↑2-y↑2,\cr}\qquad\lpile{y=(v-u)/2,\cr 0≤y<x≤N.\cr}
\eqno\rpile{(11)\cr(12)\cr}$$
Fermat's method consists of searching for values of $x$ and $y$ that
satisfy this equation. The following algorithm shows how factoring
can therefore be done {\sl without using any division:}
\algbegin Algorithm C (Factoring by addition and subtraction).
Given an odd number $N$, this algorithm determines the largest
factor of $N$ less than or equal to $\sqrt{N}$.
\algstep C1. [Initialize.] Set $x↑\prime ← 2\lfloor \sqrt{N}\rfloor
+ 1$, $y↑\prime ← 1$, $r ← \lfloor \sqrt{N}\rfloor ↑2 - N$.\xskip (During
this algorithm $x↑\prime$, $y↑\prime$, $r$ correspond respectively
to $2x + 1$, $2y + 1$, $x↑2 - y↑2 - N$ as we search for a solution
to (12); we will have $|r| < x↑\prime$ and $y↑\prime < x↑\prime$.)
\algstep C2. [Test $r$.] If $r≤0$, go to step C4.
\algstep C3. [Step $y$.] Set $r←r-y↑\prime$, $y↑\prime←y↑\prime+2$, and
return to C2.
\algstep C4. [Done?] If $r=0$, the algorithm terminates; we have
$$N=\biglp(x↑\prime-y↑\prime)/2\bigrp\biglp(x↑\prime+y↑\prime-2)/2\bigrp,$$
and $(x↑\prime-y↑\prime)/2$ is the largest factor of $N$ less than or equal to
$\sqrt N$.
\algstep C5. [Step $x$.] Set $r←r+x↑\prime$, $x↑\prime←x↑\prime+2$, and
return to C3.\quad\blackslug
\yyskip
The reader may find it amusing to find the factors of 377 by hand, using this
algorithm. The number of steps needed to find the factors $u$ and $v$ of
$N=uv$ is essentially proportional to $(x↑\prime+y↑\prime-2)/2-\lfloor\sqrt N
\rfloor=v-\lfloor\sqrt N\rfloor$; this can, of course, be a very large
number, although each step can be done very rapidly on most computers. An
improvement that requires only $O(N↑{1/3})$ operations in the worst case has been
developed by R. S. Lehman [{\sl Math.\ Comp.\ \bf28} (1974), \hbox{637--646}].
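In Python the procedure might read (a sketch, for odd $N$):

    from math import isqrt

    def fermat_factor(N):
        # Algorithm C: the largest factor of odd N that is <= sqrt(N).
        s = isqrt(N)
        xp, yp, r = 2 * s + 1, 1, s * s - N     # step C1: x' = 2x+1, y' = 2y+1
        while True:
            if r <= 0:                          # step C2
                if r == 0:                      # step C4: N = x^2 - y^2
                    return (xp - yp) // 2
                r, xp = r + xp, xp + 2          # step C5
            r, yp = r - yp, yp + 2              # step C3

    print(fermat_factor(377))                   # 13, since 377 = 13 * 29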
%folio 488 galley 7 Total loss. (C) Addison-Wesley 1978 *
It is not quite correct
to call Algorithm C ``Fermat's method,'' since Fermat used a somewhat
more streamlined approach. Algorithm C's
main loop is quite fast on computers, but it is not very
suitable for hand calculation. Fermat actually did not keep
the running value of $y$; he would look at $x↑2 - N$ and tell
whether or not this quantity was a perfect square by looking at
its least significant digits.\xskip (The last two digits of a perfect
square must be 00, $a1$, $a4$, 25, $b6$, or $a9$, where $a$ is an even digit
and $b$ is an odd digit.)\xskip Therefore he avoided
the operations of steps C2 and C3, replacing them by an occasional
determination that a certain number is not a perfect square.
Fermat's method of looking at the rightmost digits can,
of course, be generalized by using other moduli. Suppose for
clarity that $N = 11111$, and consider the following table:
$$\vbox{\ninepoint\halign to size{\hfill#\tabskip0pt plus 10pt
⊗$#\hfill$⊗$#\hfill$⊗$#\hfill$\tabskip0pt\cr
$m$⊗\hbox{if $x\mod m$ is}⊗\hbox{then $x↑2\mod m$ is}⊗\hbox{and $(x↑2-N)
\mod m$ is}\cr\noalign{\vskip3pt}
3⊗0, 1, 2⊗0, 1, 1⊗1, 2, 2\cr
5⊗0, 1, 2, 3, 4⊗0, 1, 4, 4, 1⊗4, 0, 3, 3, 0\cr
7⊗0, 1, 2, 3, 4, 5, 6⊗0, 1, 4, 2, 2, 4, 1⊗5, 6, 2, 0, 0, 2, 6\cr
8⊗0, 1, 2, 3, 4, 5, 6, 7⊗0, 1, 4, 1, 0, 1, 4, 1⊗1, 2, 5, 2, 1, 2, 5, 2\cr
11⊗0,1,2,3,4,5,6,7,8,9,10⊗0,1,4,9,5,3,3,5,9,4,1⊗10,0,3,8,4,2,2,4,8,3,0\cr}}$$
If $x↑2 - N$ is to
be a perfect square $y↑2$, it must have a residue mod $m$ consistent
with this fact, for all $m$. For example, if $N=11111$ and $x \mod 3 ≠ 0$,
then $(x↑2 - N)\mod 3 = 2$, so $x↑2 - N$ cannot
be a perfect square; therefore $x$ must be a multiple of 3 whenever
$11111 = x↑2 - y↑2$. The table tells us, in fact, that
$$\vcenter{\halign{$x\mod #\hfill=\null$⊗#\hfill\cr
3⊗0;\cr
5⊗0, 1, or 4;\cr
7⊗2, 3, 4, or 5;\cr
8⊗0 or 4 (hence $x \mod 4 = 0$);\cr
11⊗1, 2, 4, 7, 9, or 10.\cr}}\eqno(13)$$
This narrows down the search for $x$ considerably. For
example, $x$ must be a multiple of 12. We must have $x ≥ \lceil
\sqrt{N}\,\rceil = 106$, and it is easy to verify that the first
value of $x ≥ 106$ that satisfies all of the conditions in
(13) is $x = 144$. Now $144↑2 - 11111 = 9625$, and by attempting to take
the square root of 9625 we find that it is not a square. The first
value of $x > 144$ that satisfies (13) is $x= 156$. In this
case $156↑2 - 11111 =13225=115↑2$; so we have found the desired solution $x
= 156$, $y = 115$. This calculation shows that $11111 = 41 \cdot
271$.
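Tables such as (13) are entirely mechanical to produce; in Python (a sketch,
again with $N = 11111$):

    from math import isqrt

    N = 11111
    moduli = (3, 5, 7, 8, 11)
    squares = {m: {y * y % m for y in range(m)} for m in moduli}

    for m in moduli:          # the permissible residues of x, as in (13)
        print(m, [x for x in range(m) if (x * x - N) % m in squares[m]])

    x = isqrt(N - 1) + 1      # smallest x with x^2 >= N, here 106
    while True:
        if all((x * x - N) % m in squares[m] for m in moduli):
            y = isqrt(x * x - N)
            if y * y == x * x - N:
                break
        x += 1
    print(x, y, x - y, x + y)     # 156 115 41 271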
The hand calculations involved in the above example are comparable
to the amount of work required to divide 11111 by 13, 17, 19,
23, 29, 31, 37, and 41, even though the factors 41 and 271 are
not very close to each other; thus we can see the advantages
of Fermat's method.
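The example can also be rebuilt mechanically; the following Python fragment
recomputes the permissible residues of (13) directly from the definition and
then carries out the sieved search (an illustrative sketch only):

    from math import isqrt

    N = 11111
    moduli = (3, 5, 7, 8, 11)
    squares = {m: {y * y % m for y in range(m)} for m in moduli}
    allowed = {m: {x for x in range(m) if (x*x - N) % m in squares[m]}
               for m in moduli}        # the residues listed in (13)
    x = 106                            # the ceiling of sqrt(11111)
    while True:
        if all(x % m in allowed[m] for m in moduli):
            y = isqrt(x*x - N)         # x = 144 passes the sieve, but
            if y*y == x*x - N:         # 144^2 - N = 9625 is not square;
                break                  # x = 156 gives y = 115
        x += 1
    print(x - y, x + y)                # prints 41 271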
In place of the moduli considered in (13), we can use any powers of distinct primes.
For example, if we had used 25 in place of 5, we would find
that the only permissible values of $x\mod 25$ are 0, 5, 6, 10, 15, 19, and 20.
This gives more information than (13). In general, we will get more information
modulo $p↑2$ than we do modulo $p$, for odd primes $p$, when $x↑2-N≡0\modulo p$ has
a solution $x$.
The modular method just used is called a
{\sl sieve procedure}, since we can imagine passing all integers
through a ``sieve'' for which only those values with $x\mod 3=0$
come out, then sifting these numbers through another sieve that allows only
numbers with $x\mod5=0$, 1, or 4 to pass, etc. Each sieve by itself will
remove about half of the remaining values (see exercise 6); and when
we sieve with respect to moduli that are relatively prime in
pairs, each sieve is independent of the others because of the
Chinese Remainder Theorem (Theorem 4.3.2C\null). So if we sieve with
respect to, say, 30 different primes, only about one value in
every $2↑{30}$ will need to be examined to see if $x↑2 - N$ is
a perfect square $y↑2$.
\algbegin Algorithm D (Factoring with sieves). Given an
odd number $N$, this algorithm determines the largest factor
of $N$ less than or equal to $\sqrt{N}$. The procedure uses
moduli $m↓1$, $m↓2$, $\ldotss$, $m↓r$, which are relatively prime
to each other in pairs and relatively prime to $N$. We assume
that $r$ ``sieve tables'' $S[i,j]$ for $0 ≤ j < m↓i$, $1 ≤
i ≤ r$, have been prepared, where
$$S[i, j] = \left\{\vcenter{\halign{#,\qquad⊗#\hfill\cr
1⊗if $j↑2-N≡y↑2\modulo{m↓i}$ has a solution $y$;\cr
0⊗otherwise.\cr}}\right.$$
\algstep D1. [Initialize.] Set $x←\lceil\sqrt{N}\,\rceil
$, and set $k↓i ← (-x)\mod m↓i$ for $1 ≤ i ≤ r$.\xskip (Throughout this
algorithm the index variables $k↓1$, $k↓2$, $\ldotss$, $k↓r$ will
be set so that $(-x) \mod m↓i = k↓i$.)
\algstep D2. [Sieve.] If $S[i, k↓i]=1$ for $1≤i≤r$, go to step D4.
\algstep D3. [Step $x$.] Set $x ← x+1$, and set $k↓i←(k↓i-1)\mod m↓i$ for
$1 ≤ i ≤ r$. Return to step D2.
\algstep D4. [Test $x↑2 - N$.] Set $y ← \lfloor \sqrt{x↑2 -
N}\rfloor$ or $\lceil \sqrt{x↑2 - N}\,\rceil$. If $y↑2 =
x↑2 - N$, then $(x - y)$ is the desired factor, and the algorithm
terminates. Otherwise return to step D3.\quad\blackslug
\yyskip There
are several ways to make this procedure run fast. For example,
we have seen that if $N \mod 3 = 2$, then $x$ must be a multiple
of 3; we can set $x = 3x↑\prime $, and use a different sieve
corresponding to $x↑\prime $, increasing the speed threefold.
If $N\mod 9 = 1$, 4, or 7, then $x$ must be congruent respectively to
$\pm1$, $\pm2$, or $\pm4\modulo 9$; so we run two sieves (one for $x↑\prime$ and one
for $x↑{\prime\prime}$, where $x=9x↑\prime+a$ and $x=9x↑{\prime\prime}-a$) to
increase the speed by a factor of $4{1\over2}$. If $N\mod4=3$, then $x\mod4$ is known
and the speed is increased by an additional
factor of 4; in the other case, when $N\mod 4 = 1$, $x$ must
be odd so the speed may be doubled. Another way to double the
speed of the algorithm (at the expense of storage space) is to combine pairs of
moduli, using $m↓{r-k\,}m↓k$ in place of $m↓k$ for $1≤k<{1\over2}r$.
An even more important method of speeding up Algorithm D is to use the
``Boolean operations'' found on most binary computers. Let us assume, for
example, that \MIX\ is a binary computer with 30 bits per word. The tables
$S[i,k↓i]$ can be kept in memory with one bit per entry; thus 30 values can
be stored in a single word. The operation \.{AND}, which replaces the $k$th bit
of the accumulator by zero if the $k$th bit of a specified word in memory is
zero, for $1≤k≤30$, can be used to process 30 values of $x$ at once! For
convenience, we can make several copies of the tables $S[i,j]$ so that the
table entries for $m↓i$ involve $\lcm(m↓i,30)$ bits; then the sieve tables for
each modulus fill an integral number of words. Under these assumptions, 30
executions of the main loop in Algorithm D are equivalent to code of the
following form:
{\yyskip\tabskip 60pt\mixthree{\!
D2⊗LD1⊗K1⊗$\rI1←k↓1↑\prime$.\cr
⊗LDA⊗S1,1⊗$\rA←S↑\prime[1,rI1]$.\cr
⊗DEC1⊗1⊗$\rI1←\rI1-1.$\cr
\\⊗J1NN⊗*+2\cr
⊗INC1⊗M1⊗If $\rI1<0$, set $\rI1←\rI1+\lcm(m↓1,30)$.\cr
⊗ST1⊗K1⊗$k↓1↑\prime←\rI1$.\cr
\\⊗LD1⊗K2⊗$\rI1←k↓2↑\prime$.\cr
⊗AND⊗S2,1⊗$\rA←\rA∧S↑\prime[2,\rI1]$.\cr
⊗DEC1⊗1⊗$\rI1←\rI1-1$.\cr
⊗J1NN⊗*+2\cr
⊗INC1⊗M2⊗If $\rI1<0$, set $\rI1←\rI1+\lcm(m↓2,30)$.\cr
⊗ST1⊗K2⊗$k↓2↑\prime←\rI1$.\cr
\\⊗LD1⊗K3⊗$\rI1←k↓3↑\prime$.\cr
⊗$\cdots$⊗⊗($m↓3$ through $m↓r$ are like $m↓2$)\cr
⊗ST1⊗Kr⊗$k↓r↑\prime←\rI1$.\cr
\\⊗INCX⊗30⊗$x←x+30$.\cr
⊗JAZ⊗D2⊗Repeat if all sieved out.\quad\blackslug\cr}}
\yyskip\noindent
The number of cycles for 30 iterations is essentially $2+8r$; if $r=11$ this
means three cycles are being used on each iteration, just as in Algorithm C\null,
and Algorithm C involves $y={1\over2}(v-u)$ more iterations.
%folio 492 galley 8 Bad spots. (C) Addison-Wesley 1978 *
If the table entries for $m↓i$ do
not come out to be an integral number of words, further shifting
of the table entries would be necessary on each iteration so
that the bits are aligned properly. This would add quite a lot
of coding to the main loop and it would probably make the program
too slow to compete with Algorithm C unless $v/u ≤ 100$ (see
exercise 7).
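The same trick is available in a higher-level language whenever integers can
be treated as vectors of bits. In the following Python sketch each modulus
still costs one logical \.{AND} per batch of candidates; the batch size
$W=64$ and the naive mask construction are arbitrary choices, made for
clarity rather than speed:

    from math import isqrt

    def sieve_factor(N, moduli, W=64):
        # Sketch of Algorithm D, testing W candidates x per batch;
        # returns the factor x - y, which is at most sqrt(N).
        start = isqrt(N)
        if start*start < N:
            start += 1                        # x = ceiling(sqrt(N))
        S = {m: [1 if any((j*j - N) % m == y*y % m for y in range(m))
                 else 0 for j in range(m)]
             for m in moduli}                 # the sieve tables S[i,j]
        t0 = 0
        while True:
            word = (1 << W) - 1               # all W candidates alive
            for m in moduli:                  # one AND per modulus
                mask = 0
                for t in range(W):
                    mask |= S[m][(start + t0 + t) % m] << t
                word &= mask
            while word:                       # surviving candidates
                t = (word & -word).bit_length() - 1
                x = start + t0 + t
                y = isqrt(x*x - N)
                if y*y == x*x - N:
                    return x - y
                word &= word - 1
            t0 += W

    sieve_factor(11111, (3, 5, 7, 8, 11))     # returns 41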
Sieve procedures can be applied to a variety of other problems,
not necessarily having much to do with arithmetic. A survey
of these techniques has been prepared by Marvin C. Wunderlich,
{\sl JACM} {\bf 14} (1967), 10--19.
Special sieve machines (of reasonably low cost) have been constructed
by D. H. Lehmer and his associates over a period of many years;
see, for example, {\sl AMM} {\bf 40} (1933), 401--406. Lehmer's
electronic delay-line sieve, which began operating in 1965, processes
one million numbers per second. Thus, each iteration of the
loop in Algorithm D can be performed in one microsecond on this
device. Another way to factor with sieves is described by D. H. and
Emma Lehmer in {\sl Math.\ Comp.\ \bf 28} (1974), 625--635.
\subsectionbegin{Primality testing} None of the algorithms
we have discussed so far is an efficient way to determine that
a large number $n$ is prime. Fortunately, there are other methods
available for settling this question; efficient methods
have been devised by \'E. Lucas and others,
notably D. H. Lehmer [see {\sl Bull.\ Amer.\ Math.\ Soc.\ \bf 33}
(1927), 327--340].
According to Fermat's theorem (Theorem
1.2.4F\null), we have $x↑{p-1}\mod p = 1$ whenever $p$ is prime and $x$ is
not a multiple of $p$. Furthermore, there are efficient ways
to calculate $x↑{n-1}\mod n$, requiring only $O(\log n)$
operations of multiplication mod $n$.\xskip (We shall study these
in Section 4.6.3 below.)\xskip Therefore we can often determine that
$n$ is {\sl not} prime when this relationship fails.
For example,
Fermat once verified that the numbers $2↑1 + 1$, $2↑2 + 1$, $2↑4 + 1$,
$2↑8 + 1$, and $2↑{16} + 1$ are prime. In a letter to Mersenne written
in 1640, Fermat conjectured that $2↑{2↑n}+ 1$ is always
prime, but said he was unable to determine definitely whether
the number $4294967297 = 2↑{32} + 1$ is prime or not. Neither Fermat nor Mersenne
ever resolved this problem, although they could have done it as follows: The
number $3↑{2↑{32}}\mod(2↑{32}+1)$ can be computed by doing 32 operations of squaring
modulo $2↑{32}+1$, and the answer is 3029026160; therefore (by Fermat's
own theorem, which he discovered in the
same year 1640!) the number $2↑{32} + 1$ is {\sl not} prime. This
argument gives us absolutely no idea what the factors are, but
it answers Fermat's question.
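With modern software the thirty-two squarings are immediate; in Python,
for example:

    F5 = 2**32 + 1
    y = 3
    for _ in range(32):
        y = y*y % F5        # after k steps, y = 3^(2^k) mod F5
    print(y)                # prints 3029026160, not 1, so F5 isn't prime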
Fermat's theorem is a powerful test for showing non-primality of a given number.
When $n$ is not prime, it is always possible to
find a value of $x < n$ such that $x↑{n-1} \mod n ≠ 1$; experience
shows that, in fact, such a value can almost always be found very
quickly. There are some rare values of $n$ for which $x↑{n-1}
\mod n$ is frequently equal to unity, but then $n$ has a factor
less than $\spose{\raise5pt\hbox{\hskip2.5pt$\scriptscriptstyle3$}}\sqrt n$;
see exercise 9.
The same method can be extended to prove that a large prime
number $n$ really {\sl is} prime, by using the following idea:
{\sl If there is a number $x$ for which the order of $x$ modulo
$n$ is equal to $n - 1$, then $n$ is prime.}\xskip (The order of $x$ modulo
$n$ is the smallest positive integer $k$ such that $x↑k \mod
n = 1$; see Section 3.2.1.2.)\xskip For this condition implies that
the numbers $x↑k \mod n$ for $1 ≤ k ≤ n - 1$ are distinct and
relatively prime to $n$, so they must be the numbers 1, 2, $\ldotss$,
$n - 1$ in some order; thus $n$ has no proper divisors. If $n$
is prime, such a number $x$ (a ``primitive root'' of $n$) will
always exist; see exercise 3.2.1.2--16. In fact,
primitive roots are rather numerous. There are $\varphi
(n - 1)$ of them, and this is quite a substantial number, since
$n/\varphi(n-1)= O(\log\log n)$.
It is unnecessary to calculate $x↑k \mod n$ for all $k ≤
n - 1$ to determine if the order of $x$ is $n - 1$ or not. The
order of $x$ will be $n - 1$ if and only if
$$\vbox{\baselineskip14pt\halign{\qquad\hfill#⊗ $#\hfill$\cr
i)⊗x↑{n-1}\mod n=1;\cr
ii)⊗x↑{(n-1)/p}\mod n≠1\hbox{ for all primes $p$ that divide $n - 1$.}\cr}}$$
For $x↑s\mod n=1$ if and only if $s$ is
a multiple of the order of $x$ modulo $n$. If the two
conditions hold, and if $k$ is the order of $x$ modulo $n$,
we therefore know that $k$ is a divisor of $n - 1$, but not
a divisor of $(n - 1)/p$ for any prime factor $p$ of $n - 1$;
the only remaining possibility is $k=n - 1$. This completes the proof that
conditions (i) and (ii) suffice to establish the primality of $n$.
Exercise 10 shows that we can in fact use
different values of $x$ for each prime $p$, and the conclusion
that $n$ is prime still follows. We may restrict consideration to primes $x$, since
the order of $uv$ modulo $n$ divides the least common multiple
of the orders of $u$ and $v$ by exercise 3.2.1.2--15. Conditions
(i) and (ii) can be tested efficiently by using the rapid methods
for evaluating powers of numbers discussed in Section 4.6.3.
But it is necessary to know the prime factors of $n - 1$, so
we have an interesting situation in which the factorization
of $n$ depends on that of $n - 1$!
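Once the prime factors of $n-1$ are known, conditions (i) and (ii) amount to
a few lines of code; the following Python sketch simply tries $x = 2$, 3,
$\ldots$ until a witness of order $n-1$ turns up (illustrative only, and it
assumes $n$ is odd, $n>2$):

    def prime_by_order(n, factors):
        # `factors` lists the distinct primes dividing n - 1.
        for x in range(2, n):
            if pow(x, n - 1, n) != 1:
                return False            # condition (i) fails: composite
            if all(pow(x, (n - 1)//p, n) != 1 for p in factors):
                return True             # (i) and (ii) hold: n is prime
        return False

    prime_by_order(1009, [2, 3, 7])     # True: 1008 = 2^4 * 3^2 * 7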
\subsectionbegin{An example} The study of a reasonably typical
large factorization will help to fix the ideas we have discussed
so far. Let us try to find the prime factors of $2↑{214}+1$, a 65-digit
number. The factorization can be initiated with a bit of clairvoyance
if we notice that $$2↑{214} + 1 = (2↑{107} - 2↑{54}
+ 1)(2↑{107} + 2↑{54} + 1);\eqno (14)$$
this identity is a special case of some factorizations
discovered by A. Aurifeuille in 1873 [see Dickson's {\sl History},
{\bf 1}, p.\ 383]. The problem now boils down to examining each of the
33-digit factors in (14).
A computer program readily discovers that
$2↑{107} - 2↑{54} + 1 = 5 \cdot 857 \cdot n↓0$, where
$$n↓0 =37866809061660057264219253397\eqno(15)$$
is a 29-digit number having no prime factors less than 1000.
A multiple-precision calculation using the ``binary method'' of
Section 4.6.3 shows that
$$3↑{n↓0-1}\mod n↓0 = 1,$$
so we suspect that $n↓0$ is prime. It is certainly
out of the question to prove that $n↓0$ is prime by trying the
10 million million or so potential divisors, but the method discussed above
gives a feasible test for primality: our next goal is to factor $n↓0-1$. With little
difficulty, our computer will tell us that
$$n↓0-1 = 2 \cdot 2 \cdot 19 \cdot 107 \cdot 353 \cdot
n↓1,\qquad n↓1 = 13191270754108226049301.$$
Here $3↑{n↓1-1}\mod n↓1≠1$, so $n↓1$ is not prime; by continuing
Algorithm A or Algorithm B we find $$n↓1 = 91813 \cdot n↓2,\qquad n↓2
= 143675413657196977.$$This time $3↑{n↓2-1}\mod n↓2
= 1$, so we will try to prove that $n↓2$ is prime. This requires
the factorization $n↓2 - 1 = 2 \cdot 2 \cdot 2 \cdot 2 \cdot
3 \cdot 3 \cdot 547 \cdot n↓3$, where $n↓3 = 1824032775457$. Since $3↑{n↓3-1}
\mod n↓3 ≠ 1$, we know that $n↓3$ is composite, and
Algorithm A finds that $n↓3 = 1103 \cdot n↓4$, where $n↓4 = 1653701519$.
The number $n↓4$ behaves like a prime (i.e., $3↑{n↓4-1}
\mod n↓4 = 1)$, so we calculate
$$n↓4 - 1 = 2 \cdot 7 \cdot 19 \cdot 23 \cdot 137 \cdot 1973.$$
Good; this is our first complete factorization. We are now
ready to backtrack to the previous subproblem, proving that
$n↓4$ is prime. Using the procedure suggested by exercise 10, we compute
the following values:
$$\vcenter{\halign{$\ctr{#}$⊗\qquad$\rt{#}$⊗\qquad$\ctr{#}$⊗\qquad$\ctr{#}$\cr
x⊗p⊗x↑{(n↓4-1)/p}\mod n↓4⊗x↑{n↓4-1}\mod n↓4\cr
\noalign{\vskip 3pt}
2⊗2⊗1⊗(1)\cr
2⊗7⊗766408626⊗(1)\cr
2⊗19⊗332952683⊗(1)\cr
2⊗23⊗1154237810⊗(1)\cr
2⊗137⊗373782186⊗(1)\cr
2⊗1973⊗490790919⊗(1)\cr
3⊗2⊗1⊗(1)\cr
5⊗2⊗1⊗(1)\cr
7⊗2⊗1653701518⊗1\cr}}\eqno(16)$$
(Here ``(1)'' means a result of 1 that needn't
be computed since it can be deduced from previous calculations.)
Thus $n↓4$ is prime, and $n↓2 - 1$ has been completely factored.
A similar calculation shows that $n↓2$ is prime, and this complete
factorization of $n↓0 - 1$ finally shows [after still another
calculation like (16)] that $n↓0$ is prime.
The next quantity to be factored is the other half of (14), namely
$$n↓5= 2↑{107} + 2↑{54} + 1.$$
Since $3↑{n↓5-1}\mod n↓5 ≠ 1$, we know
that $n↓5$ is not prime, and Algorithm B shows that $n↓5 = 843589
\cdot n↓6$, where $n↓6 = 192343993140277293096491917$. Unfortunately,
$3↑{n↓6-1} \mod n↓6 ≠ 1$, so we are left with a 27-digit
nonprime number. Continuing Algorithm B might well exhaust our
patience (not our budget---nobody is paying for this, we're
using idle time on a weekend). But the sieve method of Algorithm\penalty999\
D will be able to crack $n↓6$ into its two factors,
$$n↓6=8174912477117\cdot23528569104401.$$
This result could {\sl not} have been discovered by Algorithm A in a
reasonable length of time.\xskip (A few million iterations of Algorithm B would
probably have sufficed.)
Now the computation is complete: $2↑{214}+1$ has the prime factorization
$$5 \cdot 857 \cdot 843589 \cdot 8174912477117 \cdot 23528569104401
\cdot n↓0,$$
where $n↓0$ is the 29-digit prime in (15). A certain
amount of good fortune entered into these calculations, for
if we had not started with the known factorization (14) it is
quite probable that we would first have cast out the small factors,
reducing $n$ to $n↓6n↓0$. This 55-digit number would have
been much more difficult to factor---Algorithm D would be useless
and Algorithm B would have to work overtime because of the high
precision necessary.
Dozens of further numerical examples can
be found in an article by John Brillhart and J. L. Selfridge,
{\sl Math.\ Comp.\ \bf 21} (1967), 87--96.
%folio 496 galley 9 Bad spots. (C) Addison-Wesley 1978 *
\subsectionbegin{Improved primality tests} Since the
above procedure for proving that $n$ is prime requires the complete
factorization of $n - 1$, it will bog down for large $n$. Another
technique, which uses the factorization of $n + 1$ instead,
is described in exercise\penalty999\ 15; if $n - 1$ turns out to be too
hard, $n + 1$ might be easier.
Significant improvements are available for dealing with large $n$. For example,
Brillhart,
Lehmer, and Selfridge [{\sl Math.\ Comp.\ \bf 29} (1975), 620--647,
Corollary 11] have developed a method that works when
$n - 1$ and $n + 1$ have been only partially factored: Suppose $n-1=f↑-r↑-$ and
$n+1=f↑+r↑+$, where we know
the complete factorizations of $f↑-$ and $f↑+$, and we also know that
all factors of $r↑-$ and $r↑+$ are $≥b$. If the product $\biglp b↑3f↑-f↑+\max(f↑-,
f↑+)\bigrp$ is greater than $2n$, a small amount of additional computation,
described in their paper, will determine whether or not $n$
is prime. Therefore numbers of up to 35 digits can usually
be tested for primality in 2 or 3 seconds, simply by casting
out all prime factors $<30030$ from $n\pm 1$ [see J. L. Selfridge
and M. C. Wunderlich, {\sl Proc. Fourth Manitoba Conf. Numer.
Math.} (1974), 109--120]. The partial factorization of other
quantities like $n↑2 \pm n +1$ and $n↑2+1$ can be used to improve this method
still further [see H. C. Williams and J. S. Judd, {\sl Math.\ Comp.\ \bf30}
(1976), 157--172, 867--886].
In practice, when $n$ has no small prime
factors and $3↑{n-1}\mod n = 1$, it has almost always turned
out that $n$ is prime.\xskip $\biglp$One of the rare exceptions in the
author's experience is $n={1\over7}(2↑{28}-9)=2341\cdot16381$.$\bigrp$\xskip
On the other hand, some nonprime values of $n$ are definitely bad news for the
primality test we have discussed, because it might happen that $x↑{n-1}\mod n=1$
for all $x$ relatively prime to $n$ (see exercise 9). One such number is
$n = 3 \cdot 11 \cdot 17 = 561$; here $λ(n) = \lcm(2, 10, 16)
= 80$ in the notation of Eq.\ 3.2.1.2--9, so $x↑{80}\mod 561 =
1 = x↑{560}\mod 561$ whenever $x$ is relatively prime to 561.
Our procedure would repeatedly fail to show that such an $n$ is prime, until we
had stumbled across one of its divisors. To improve the method, we need a
quick way to determine the nonprimality of nonprime $n$, even in such
pathological cases.
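For instance, a one-line experiment in Python confirms the pathology for
$n=561$:

    from math import gcd
    n = 561
    print(all(pow(x, n - 1, n) == 1
              for x in range(1, n) if gcd(x, n) == 1))   # prints True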
The following simple procedure is guaranteed to do the job with high probability:
\algbegin Algorithm P (Probabilistic primality test). Given an odd integer $n$,
this algorithm attempts to decide whether or not $n$ is prime. By repeating
the algorithm several times, as explained in the remarks below, it is possible to
be extremely confident about the primality of $n$, in a precise sense, yet the
primality will not be rigorously proved. Let $n=1+2↑kq$, where $q$ is odd.
\algstep P1. [Generate $x$.] Let $x$ be a random integer in the range $1<x<n$.
\algstep P2. [Exponentiate.] Set $j←0$ and $y←x↑q\mod n$.\xskip (As in our previous
primality test, $x↑q\mod n$ should be calculated in $O(\log q)$ steps, cf.\
Section 4.6.3.)
\algstep P3. [Done?] (Now $y=x↑{2↑jq}\mod n$.)\xskip
If $j=0$ and $y=1$, or if $y=n-1$, terminate the algorithm
and say ``$n$ is probably prime.'' If $j>0$ and $y=1$, go to step P5.
\algstep P4. [Increase $j$.] Increase $j$ by 1. If $j<k$, set $y←y↑2\mod n$ and
return to step P3.
\algstep P5. [Not prime.] Terminate the algorithm and say ``$n$ is definitely
not prime.''\quad\blackslug
\yyskip The idea underlying Algorithm P is that if $n=1+2↑kq$ is prime and
$x↑q\mod n≠1$, the sequence of values
$$x↑{2q}\mod n,\quad x↑{4q}\mod n,\quad\ldotss,\quad x↑{2↑kq}\mod n$$
will end with 1, and the value just preceding the first appearance of 1 will be
$n-1$.\xskip$\biglp$The only solutions to $y↑2≡1\modulo p$ are $y≡\pm1$, when $p$
is prime, since $(y-1)(y+1)$ must be a multiple of $p$.$\bigrp$
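A direct transcription of Algorithm P, with the repetition that is discussed
below, might read as follows in Python (a sketch; $n$ is assumed odd and
greater than 2):

    import random

    def algorithm_p(n, times=25):
        # Sketch of Algorithm P, repeated with independent random x.
        k, q = 0, n - 1
        while q % 2 == 0:
            k, q = k + 1, q // 2          # n = 1 + 2^k * q, q odd
        for _ in range(times):
            x = random.randrange(2, n)    # P1: random x, 1 < x < n
            y = pow(x, q, n)              # P2: y = x^q mod n
            if y == 1 or y == n - 1:
                continue                  # P3: probably prime this round
            for _ in range(k - 1):        # P4: square up to k - 1 times
                y = y * y % n
                if y == n - 1:
                    break                 # P3: probably prime this round
                if y == 1:
                    return False          # P5: definitely not prime
            else:
                return False              # P5: definitely not prime
        return True                       # passed every round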
Exercise 22 proves the basic fact that Algorithm P will be wrong at most $1\over4$
of the time, for all $n$. Actually it will rarely fail at all, for most $n$;
but the crucial point is that the probability of failure is bounded {\sl
regardless} of the value of $n$.
Suppose we invoke Algorithm P repeatedly, choosing $x$ independently and at random
whenever we get to step P1. If the algorithm ever reports that $n$ is nonprime,
we can say that $n$ definitely isn't prime. But if the algorithm reports 25
times in a row that $n$ is ``probably prime,'' we can say that $n$ is ``almost
surely prime.'' For the probability is less than $(1/4)↑{25}$ that such a
25-times-in-a-row procedure gives the wrong information about $n$. This is less
than one chance in a quadrillion; even if we certified a billion different primes
with such a procedure, the expected number of mistakes would be less than
$1\over1000000$. It's much more likely that our computer has dropped a bit in its
calculations, due to hardware malfunctions or cosmic radiations, than that
Algorithm P has repeatedly guessed wrong!
Probabilistic algorithms like this lead us to question our traditional standards
of reliability. Do we really {\sl need} to have a rigorous proof of primality?
For people unwilling to abandon traditional notions of proof, Gary L. Miller
has demonstrated that, if $\spose{\raise5pt\hbox{\hskip2.5pt$\scriptscriptstyle r
$}}\sqrt n$ is not an integer for any integer $r≥2$ (this condition being
easily checked), and if a certain well-known conjecture in number theory called
the Extended Riemann Hypothesis can be proved, then either $n$ is prime or
there is an $x<4(\ln n)↑2$ such that Algorithm P will discover the nonprimality
of $n$.\xskip [See {\sl J. Comp.\ System Sci.\ \bf 13} (1976), 300--317. The
constant 4 in this upper bound is due to Peter Weinberger, whose paper on the
subject
is not yet published.]\xskip Thus, we would have a rigorous way to test primality in
$O(\log n)↑5$ elementary operations, as opposed to a probabilistic method whose
running time is $O(\log n)↑3$. But one might well ask whether any purported
proof of the Extended Riemann Hypothesis will ever be as reliable as repeated
application of Algorithm\penalty999\ P on random $x$'s.
A probabilistic test for primality was first proposed in 1974 by R. Solovay and
V. Strassen, who devised the interesting but somewhat more complicated test
described in exercise 23(b).\xskip[See {\sl SIAM J. Computing \bf 6} (1977), 84--85;
{\bf7} (1978), 118.]\xskip Algorithm P is a simplified version of a procedure
due to M. O. Rabin, based in part on ideas of Gary L. Miller [cf.\ {\sl
Algorithms and Complexity}, ed.\ by J. F. Traub (New York: Academic Press, 1976),
35--36].
\subsectionbegin{Factoring via continued fractions} The factorization
procedures we have discussed so far will often balk at numbers
of 30 digits or more, and another idea is needed if we are to
go much further. Fortunately there is such an idea; in fact,
there were two ideas, due respectively to A. M. Legendre and
M. Kraitchik, which D. H. Lehmer and R. E. Powers used to devise
a new technique many years ago [{\sl Bull.\ Amer.\ Math.\ Soc.\
\bf 37} (1931), 770--776]. However, the method was not used
at that time because it was comparatively unsuitable for desk calculators. This
negative judgment prevailed until the late 1960s, when John Brillhart
found that the Lehmer--Powers approach deserved to be resurrected,
since it was quite well-suited to computer programming. In fact,
he and Michael A. Morrison later developed it into the current
champion of all methods for factoring large numbers: It handles
typical 25-digit numbers in about 30 seconds, and 40-digit numbers
in about 50 minutes, on an IBM 360/91 computer [see {\sl Math.\ Comp.\
\bf 29} (1975), 183--205]. In 1970 the method had its first
triumphant success, discovering that $2↑{128} + 1 = 59649589127497217
\cdot 5704689200685129054721$.
The basic idea is to search for numbers $x$ and $y$ such that
$$\quad x↑2 ≡ y↑2 \modulo N,\qquad 0 < x, y < N,\qquad x ≠ y,\qquad
x + y ≠ N.\eqno (17)$$
Fermat's method imposes the stronger requirement $x↑2-y↑2=N$, but
actually the congruence (17) is enough to split $N$ into factors:
It implies that $N$ is a divisor of $x↑2 - y↑2 = (x - y)(x +
y)$, yet $N$ divides neither $x - y$ nor $x + y$; hence $\gcd(N,
x - y)$ and $\gcd(N, x + y)$ are proper factors of $N$ that
can be found by Euclid's algorithm.
One way to discover solutions of (17) is to look for values of $x$ such that
$x↑2≡a\modulo N$, for small values of $|a|$. As we will see, it is often a simple
matter to piece together solutions of this congruence to obtain solutions of (17).
Now if $x↑2=a+kNd↑2$ for some $k$ and $d$, with small $|a|$, the fraction
$x/d$ is a good approximation to $\sqrt{kN}\,$; conversely, if $x/d$ is an
especially good approximation to $\sqrt{kN}$, the difference $|x↑2-kNd↑2|$ will
be small. This observation suggests looking at the continued fraction expansion
of $\sqrt{kN}$, since we have seen (Eq.\ 4.5.3--12) that continued fractions
yield good rational approximations.
Continued fractions for quadratic irrationalities have many pleasant prop\-er\-ties,
which are proved in exercise 4.5.3--12. The algorithm below makes use of these
properties to derive solutions to the congruence
$$x↑2≡(-1)↑{e↓0}p↓1↑{e↓1}p↓2↑{e↓2}\ldotss p↓m↑{e↓m}\;\modulo N.\eqno(18)$$
Here we use a fixed set of small primes $p↓1=2$, $p↓2=3$, $\ldotss$, up to
$p↓m$; only primes $p$ such that either $p=2$ or $(kN)↑{(p-1)/2}\mod p ≤1$ should
appear in this list, since other primes
will never be factors of the numbers generated by the algorithm
(see exercise 14). If $(x↓1, e↓{01}, e↓{11}, \ldotss , e↓{m1})$,
$\ldotss$, $(x↓r, e↓{0r}, e↓{1r}, \ldotss , e↓{mr})$ are solutions
of (18) such that the vector sum
$$(e↓{01}, e↓{11}, \ldotss , e↓{m1}) + \cdots + (e↓{0r}, e↓{1r},
\ldotss , e↓{mr}) = (2e↑\prime↓0,2e↑\prime↓1, \ldotss , 2e↑\prime↓m)\eqno(19)$$
is {\sl even} in each component, then
$$x = (x↓1 \ldotsm x↓r)\mod N,\qquad y = \biglp(-1)↑{e↓0↑\prime}p↑{e↓1↑\prime}↓{1}
\ldotsm p↑{e↓{\!m}↑\prime}↓{m}\bigrp\mod N\eqno (20)$$
yields a solution to (17), except for the possibility
that $x ≡ \pm y$. Condition (19) essentially says that the vectors
are linearly dependent modulo 2, so we must have a solution to (19) if we have
found at least $m+2$ solutions to (18).
%folio 500 galley 10 Bad spots. (C) Addison-Wesley 1978 *
\algbegin Algorithm E (Factoring via continued
fractions). Given a positive integer $N$ and a positive integer
$k$ such that $kN$ is not a perfect square, this algorithm attempts
to discover solutions to the congruence (18) for fixed $m$,
by analyzing the convergents to the continued fraction for $\sqrt{kN}$.\xskip
(Another algorithm, which uses the outputs to discover factors
of $N$, is the subject of exercise 12.)
\algstep E1. [Initialize.] Set $D ← kN$, $R
← \lfloor \sqrt{D}\rfloor$, $R↑\prime ← 2R$, $U ← U↑\prime ←R↑\prime
$, $V ← 1$, $V↑\prime ← D - R↑2$, $P ← R$, $P↑\prime ← 1$, $A ← 0$, $S ←
0$.\xskip (This algorithm follows the general procedure of exercise
4.5.3--12, finding the continued fraction expansion of $\sqrt{kN}$.
The variables $U$, $U↑\prime$, $V$, $V↑\prime$, $P$, $P↑\prime$, $A$,
and $S$ represent, respectively, what that exercise calls $R + U↓n$,
$R + U↓{n-1}$, $V↓n$, $V↓{n-1}$, $p↓n\mod N$, $p↓{n-1}\mod N$, $A↓n$,
and $n\mod 2$. We will always have $0 < V ≤ U ≤ R↑\prime $,
so the highest precision is needed only for $P$ and $P↑\prime$.)
\algstep E2. [Advance $U$, $V$, $S$.] Set $T ← V$, $V ← A(U↑\prime
- U) + V↑\prime$, $V↑\prime ← T$, $A ← \lfloor U/V\rfloor$, $U↑\prime
← U$, $U ← R↑\prime - (U\mod V)$, $S ← 1 - S$.
\algstep E3. [Factor $V$.] $\biglp$Now we have $P↑2 - kNQ↑2 = (-1)↑SV$,
for some $Q$ relatively prime to $P$, by exercise 4.5.3--12(c).$\bigrp$\xskip
Set $(e↓0, e↓1, \ldotss , e↓m) ← (S,0,\ldotss,0)$,
$T ← V$. Now do the following, for $1 ≤ j ≤ m$: If $T\mod
p↓j = 0$, set $T ← T/p↓j$ and $e↓j ← e↓j + 1$, and repeat this
process until $T\mod p↓j ≠ 0$.
\algstep E4. [Solution?] If $T = 1$, output the values $(P,
e↓0, e↓1, \ldotss , e↓m)$, which comprise a solution to (18).\xskip
(If enough solutions have been generated, we may terminate the
algorithm now.)
\algstep E5. [Advance $P$, $P↑\prime$.] If $V ≠ 1$ or $U ≠ R↑\prime
$, set $T ← P$, $P ← (AP + P↑\prime)\mod N$, $P↑\prime ← T$, and return to step E2.
Otherwise the continued fraction process has started to repeat
its cycle, except perhaps for $S$, so the algorithm terminates.
$\biglp$The cycle will usually be so long that this doesn't happen,
unless $kN$ is nearly a perfect square.$\bigrp$\quad\blackslug
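\yyskip For experimentation with small $N$, Algorithm E transcribes almost
line for line into Python; the following sketch (with an arbitrary cutoff on
the number of iterations) returns the list of outputs $(P, e↓0, \ldotss, e↓m)$:

    from math import isqrt

    def algorithm_e(N, k, primes, iterations=100):
        # Sketch of Algorithm E; assumes kN is not a perfect square.
        D = k * N
        R = isqrt(D); Rp = 2 * R
        U = Up = Rp
        V, Vp = 1, D - R*R
        P, Pp = R, 1
        A = S = 0
        out = []
        for _ in range(iterations):
            T = V                              # E2: advance U, V, S
            V = A*(Up - U) + Vp
            Vp = T
            A = U // V
            Up, U = U, Rp - U % V
            S = 1 - S
            e = [S] + [0]*len(primes)          # E3: factor V
            T = V
            for j, p in enumerate(primes, 1):
                while T % p == 0:
                    T //= p; e[j] += 1
            if T == 1:                         # E4: a solution of (18)
                out.append((P, tuple(e)))
            if V == 1 and U == Rp:             # E5: the cycle has ended
                break
            P, Pp = (A*P + Pp) % N, P
        return out

    algorithm_e(197209, 1, (2, 3, 5))
    # begins [(159316, (0, 4, 2, 1)), (133218, (0, 0, 4, 1)),
    #         (37250, (1, 3, 1, 0)), ...], the outputs shown below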
\yyskip We can illustrate the application
of Algorithm E to relatively small numbers by considering the
case $N = 197209$, $k = 1$, $m = 3$, $p↓1 = 2$, $p↓2 = 3$, $p↓3 = 5$.
The computation proceeds as follows:
$$\vbox{\halign to size{\hfill#\tabskip0pt plus 10pt
⊗$\rt{#}$⊗$\rt{#}$⊗$\rt{#}$⊗$\rt{#}$⊗$\rt{#}$⊗$\rt{#}$⊗$\lft{#}$\tabskip0pt\cr
⊗U\hfill⊗V\hfill⊗A\hfill⊗P\hfill⊗S⊗T⊗\hfill\hbox{Output}\cr
\noalign{\vskip 3pt}
After E1:⊗888⊗1⊗0⊗444⊗0⊗\hbox{---}\cr
After E4:⊗876⊗73⊗12⊗444⊗1⊗73\cr
After E4:⊗882⊗145⊗6⊗5329⊗0⊗29\cr
After E4:⊗857⊗37⊗23⊗32418⊗1⊗37\cr
After E4:⊗751⊗720⊗1⊗159316⊗0⊗1⊗159316↑2 ≡ +2↑4 \cdot 3↑2 \cdot 5↑1\cr
After E4:⊗852⊗143⊗5⊗191734⊗1⊗143\cr
After E4:⊗681⊗215⊗3⊗131941⊗0⊗43\cr
After E4:⊗863⊗656⊗1⊗193139⊗1⊗41\cr
After E4:⊗883⊗33⊗26⊗127871⊗0⊗11\cr
After E4:⊗821⊗136⊗6⊗165232⊗1⊗17\cr
After E4:⊗877⊗405⊗2⊗133218⊗0⊗1⊗133218↑2 ≡ +2↑0 \cdot 3↑4 \cdot 5↑1\cr
After E4:⊗875⊗24⊗36⊗37250⊗1⊗1⊗37250↑2 ≡ -2↑3 \cdot 3↑1 \cdot 5↑0\cr
After E4:⊗490⊗477⊗1⊗93755⊗0⊗53\cr}}$$
Continuing the computation gives
25 outputs in the first 100 iterations; in other words, the
algorithm is finding solutions quite rapidly. But some of the solutions
are trivial. For example, if the above computation were continued
13 more times, we would obtain the output $197197↑2 ≡ 2↑4 \cdot
3↑2 \cdot 5↑0$, which is of no interest since $197197 ≡ -12$.
The first two solutions above are already enough to complete
the factorization: We have found that
$$(159316 \cdot 133218)↑2 ≡ (2↑2 \cdot 3↑3 \cdot 5↑1)↑2\;\modulo{197209};$$
thus (17) holds with $x = (159316 \cdot 133218)
\mod 197209 = 126308$, $y = 540$. By Euclid's algorithm, $\gcd(126308
- 540, 197209) = 199$; hence we obtain the pretty factorization
$$197209 = 199 \cdot 991.$$
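The finishing computation is equally mechanical in Python:

    from math import gcd

    N = 197209
    # the exponent vectors (0,4,2,1) and (0,0,4,1) sum to (0,4,6,2),
    # which is even in every component, so (20) applies:
    x = (159316 * 133218) % N                 # 126308
    y = ((-1)**0 * 2**2 * 3**3 * 5**1) % N    # 540
    print(gcd(N, x - y), gcd(N, x + y))       # prints 199 991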
Algorithm E begins its attempt to factorize $N$ by essentially replacing $N$ by
$kN$, and
this is a rather curious way to proceed (if not downright stupid). Nevertheless,
it turns out to be a good idea, since certain values of $k$ will make the $V$
numbers divisible by more small primes, hence they will be more likely to
factor completely in step E3. On the other hand, a large value of $k$ will make
the $V$ numbers larger, hence they will be less likely to factor completely;
we want to balance these tendencies by choosing $k$ wisely. Consider, for example,
the divisibility of $V$ by powers of 5. We have $P↑2
- kNQ↑2 = (-1)↑SV$ in step
E3, so if 5 divides $V$ we have $P↑2 ≡ kNQ↑2 \modulo 5$. In this congruence
$Q$ cannot be a multiple of 5, since it is relatively prime
to $P$, so we may write $(P/Q)↑2 ≡ kN\modulo 5$. If we assume that $P$
and $Q$ are random relatively prime integers, so that the 24
possibilities of $(P \mod 5, Q \mod 5) ≠ (0, 0)$ are equally
likely, the probability that 5 divides $V$ is therefore ${4\over
24}$, ${8\over 24}$, 0, 0, or ${8\over 24}$ according as $kN
\mod 5$ is 0, 1, 2, 3, or 4. Similarly the probability that 25 divides
$V$ is 0, ${40\over 600}$, 0, 0, ${40\over 600}$ respectively,
unless $kN$ is a multiple of 25. In general, given an odd prime
$p$ with $(kN)↑{(p-1)/2}\mod p=1$, we find that $V$ is a multiple
of $p↑e$ with probability $2/\biglp p↑{e-1}(p + 1)\bigrp$; and the average
number of times $p$ divides $V$ comes to $2p/(p↑2 - 1)$. This
analysis, suggested by R. Schroeppel, suggests that the best
choice of $k$ is that which maximizes
$$\chop to 12pt{\sum↓{p\,\,\hbox{\:d prime}\,}f(p, kN)\log p -
\textstyle{1\over 2}\log k,}\eqno(21)$$
where $f$ is the function defined in exercise 13,
for this is essentially the expected value of the logarithm
of $\sqrt{N}/T$ when we reach step E4.
Best results will be obtained with Algorithm E when both $k$ and $m$ are
well chosen. But before we study the choice of $m$, let us consider an important
refinement of the algorithm: Comparing step E3 with Algorithm
A\null, we see that the factoring of $V$ can stop whenever we find
$T\mod p↓j ≠ 0$ and $\lfloor T/p↓j\rfloor ≤ p↓j$, since $T$
will then be either 1 or prime. If $T$ is a prime greater than $p↓m$
(it will be at most $p↓m↑2 + p↓m - 1$ in such a case), we can still output
$(P, e↓0, \ldotss , e↓m, T)$, since a complete factorization
has been obtained. The second phase of the algorithm will use
only those outputs whose prime $T$'s have occurred at least
twice. This modification gives the effect of a much longer list
of primes, without increasing the factorization time.
Now let's make a heuristic analysis of
the running time of Algorithm E\null, following unpublished ideas
of R. Schroeppel. We will assume for convenience that $k = 1$.
The number of outputs needed to produce a factorization of $N$,
using the modification in the preceding paragraph, will be roughly proportional
to the number of suitable primes less than $p↓m↑2$; this will be
of order $m↑2\log m$, but let's say it is $m↑2$.
Each execution of step E3 will take about order $m$ units
of time; and if we assume that $V$ is randomly distributed between
0 and $2\sqrt{N}$ our chance of a successful output per iteration
will be approximately $F\biglp(\log p↓m↑2)/(\log 2\sqrt{N}\,)\bigrp$,
where $F$ is Dickman's function of Fig.\ 11 and Eq.\ (6).
Under these assumptions, the total
running time is roughly proportional to
$$m↑3/\,F(1/α),\qquad α = (\log 2\sqrt{N}\,)/(\log p↓m↑2).\eqno(22)$$
It is possible to show that $F(1/α) =\exp\biglp-α
\ln α + O(α\log\log α)\bigrp$ as $α→∞$;
in fact, N. G. de Bruijn [{\sl J. Indian Math.\ Soc.\ \bf 15}
(1951), 25--32] has obtained a much sharper estimate. If we
now choose$$\ln m = \sqrt{(\ln N)(\ln\ln N)/24}$$we find that
(22) becomes $$\exp\biglp\sqrt{\textstyle{3\over 2}(\ln N)(\ln\ln N)}\;+\;O\biglp
(\log N)↑{1/2}(\log\log N)↑{-1/2}(\log\log\log N)\bigrp\bigrp.$$
Stating this another way, the running time is $N↑{ε(N)}$, where
$$ε(N)\approx\sqrt{{3\over 2}{\ln\ln N\over\ln N}}$$ goes to 0 as $N → ∞$.
These asymptotic formulas are too crude to be applied for $N$
in a practical range, however; some extensive tests by M. C. Wunderlich indicate
that $m=150$ is a nearly optimum value, when $10↑{30}<N<10↑{41}$.
Since step E3 is by far the most time-consuming part of the
algorithm, Morrison, Brillhart, and Schroeppel have suggested
several ways to abort this step when success becomes improbable:\xskip
(a) Whenever $T$ changes to a single-precision value, continue
only if $\lfloor T/p↓j\rfloor > p↓j$ and $3↑{T-1}\mod T ≠
1$.\xskip (b) Give up if $T$ is still $>p↓m↑2$ after casting out factors
$<{1\over 10}p↓m$.\xskip (c) Cast out factors only up to $p↓5$,
say, for batches of 100 or so consecutive $V$'s; continue the
factorization later, but only on the $V$ from each batch that
has produced the smallest residual $T$.
For estimates of the cycle length in the output of Algorithm E, see
D. R. Hickerson, {\sl Pacific J. Math.\ \bf 46} (1973),
429--432; D. Shanks, {\sl Proc.\ Boulder Number Theory Conference}
(Univ.\ of Colorado: 1972), 217--224.
\subsectionbegin{Other approaches} A completely different
method of factorization, based on composition of binary quadratic
forms, has been introduced by Daniel Shanks [{\sl Proc.\ Symp.\
Pure Math.\ \bf 20} (1971), 415--440]. Like Algorithm B, it
will factor $N$ in $O(N↑{(1/4)+ε})$ steps except under
wildly improbable circumstances.
Still another important technique has been suggested by John
M. Pollard [{\sl Proc.\ Cambridge Phil.\ Soc.\ \bf 76} (1974),
521--528]. He obtains rigorous worst-case bounds of $O(N↑{(1/4)+ε
})$ for factorization and $O(N↑{(1/8)+ε})$ for primality
testing, but with impracticably high coefficients of proportionality;
and he also gives a practical algorithm for discovering prime
factors $p$ of $N$ when $p - 1$ has no large prime factors.
The latter algorithm (see exercise 19) is probably the first thing to try
after Algorithms A and B have run too long on a large $N$.
See also the survey by R. K. Guy, written in collaboration with J. H. Conway,
{\sl Proc. Fifth Manitoba Conf.\ Numer.\ Math.} (1975), 49--89.
%folio 505 galley 11 (C) Addison-Wesley 1978 *
\subsectionbegin{The largest known primes} We have discussed several
computational methods elsewhere in this book that require the
use of large prime numbers, and the techniques just described
can be used to discover primes of up to, say, 25 digits
with relative ease. Table 1 shows the ten largest primes
that are less than the word size of typical computers.\xskip (Some
other useful primes appear in the answer to exercise 4.6.4--14.)
\topinsert{\vbox to 520pt{\hbox{(Table 1 will go on this page,
it's being set separately)}}}
Actually much larger primes of special forms are known, and
it is occasionally important to find primes that are as large
as possible. Let us therefore conclude this section by investigating
the interesting manner in which the largest explicitly known
primes have been discovered. Such primes are of the form $2↑n
- 1$, for various special values of $n$, and so they are especially
suited to certain applications of binary computers.
%folio 508 galley 12 Bad beginning. (C) Addison-Wesley 1978 *
\def\\#1{\sqrt{\hskip1pt\lower1pt\null#1}}
A number of the form $2↑n - 1$ cannot
be prime unless $n$ is prime, since $2↑{uv} - 1$ is divisible
by $2↑u - 1$. In 1644, Marin Mersenne astonished his contemporaries
by stating, in essence, that the numbers $2↑p - 1$ are prime
for $p = 2$, 3, 5, 7, 13, 17, 19, 31, 67, 127, 257, and for
no other $p$ less than 257.\xskip (This statement appeared in connection with a
discussion of perfect numbers in the preface to his {\sl Cogitata
Physico-Mathematica}. Curiously, he also made the following remark:
``To tell if a given number of 15 or 20 digits is prime or not, all time would not
suffice for the test, whatever use is made of what is already known.'')\xskip
Mersenne, who had corresponded frequently with Fermat, Descartes, and others about
similar topics in previous years, gave no proof of his assertions, and for
over 200 years nobody knew whether he was correct or not. Euler showed that
$2↑{31}-1$ is prime in 1772, after having tried unsuccessfully to prove this
in previous years. About 100 years later, \'E. Lucas discovered that
$2↑{127}-1$ is prime, but $2↑{67}-1$ is not; therefore Mersenne was not
completely accurate. Then I. M. Pervushin proved in 1883 that $2↑{61}-1$ is
prime [cf.\ {\sl Istoriko-Mat.\ Issledovani\t\i a \bf6} (1953), 559], and this
touched off speculation that Mersenne had only made a copying
error, writing 67 for 61. Eventually other errors in Mersenne's
statement were discovered; R. E. Powers [{\sl AMM \bf 18}
(1911), 195] found that $2↑{89} - 1$ is prime, as had been conjectured
by some earlier writers, and three years later he proved that
$2↑{107} - 1$ also is prime.\xskip M. Kraitchik showed in 1922 that
$2↑{257}- 1$ is {\sl not} prime.
At any rate, numbers of the form $2↑p - 1$ are now known as {\sl Mersenne numbers},
and it is known
that the first 25 Mersenne primes are obtained for $p$ equal to
$$\baselineskip15pt
\cpile{2,\,3,\,5,\,7,\,13,\,17,\,19,\,31,\,61,\,89,\,107,\,127,\,521,\,607,
\,1279,\cr
2203,\,2281,\,3217,\,4253,\,4423,\,9689,\,9941,\,11213,\,19937,\,21701.\cr}
\eqno(23)$$
The 24th of these was found by Bryant Tuckerman
[{\sl Proc.\ Nat.\ Acad.\ Sci.\ \bf 68} (1971), 2319--2320],
and the 25th was found in 1978 by Laura Nickel and Curt Noll.\xskip
(Note that $8191 = 2↑{13} - 1$ does not occur in
this list; Mersenne had stated that $2↑{8191} - 1$ is prime, and
others had conjectured that any Mersenne prime could perhaps
be used in the exponent.)
Since $2↑{21701} - 1$ is a 6533-digit number, it is clear that
some special techniques have been used to prove that it is prime.
An efficient method for testing the primality of a given Mersenne number
$2↑p - 1$ was first devised by \'E. Lucas [{\sl Amer.\ J. Math.\ \bf 1}
(1878), 184--239, 289--321, especially p.\ 316] and improved
by D. H. Lehmer [{\sl Annals of Math.\ \bf 31} (1930), 419--448,
especially p.\ 443]. The Lucas--Lehmer test, which is a special
case of the method now used for testing the primality of $n$
when the factors of $n + 1$ are known, is the following:
\thbegin Theorem L. {\sl Let $q$ be an odd prime, and
define the sequence $\langle L↓n\rangle$ by the rule
$$L↓0 = 4,\qquad L↓{n+1} = (L↓n↑2 - 2)\mod (2↑q - 1).\eqno (24)$$
Then $2↑q - 1$ is prime if and only if $L↓{q-2} = 0$.}
\yyskip For example, $2↑3 - 1$ is prime since $L↓1
= (4↑2 - 2)\mod 7 = 0$. This test is particularly well suited
to binary computers, using multiple-precision arithmetic when $q$
is large, since calculation mod $(2↑q - 1)$ is so convenient;
cf.\ Section 4.3.2.
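Theorem L becomes a few lines of code in any language with unbounded
integers; for example, in Python:

    def lucas_lehmer(q):
        # For odd prime q, 2^q - 1 is prime iff L[q-2] = 0, by Theorem L.
        M = 2**q - 1
        L = 4
        for _ in range(q - 2):
            L = (L*L - 2) % M      # the recurrence (24)
        return L == 0

    [q for q in (3, 5, 7, 11, 13, 17, 19) if lucas_lehmer(q)]
    # yields [3, 5, 7, 13, 17, 19]; the exponent 11 drops out,
    # since 2^11 - 1 = 23 * 89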
\proofbegin We will prove Theorem L using only
very simple principles of number theory, by investigating several
features of recurring sequences that are of independent interest.
Consider the sequences $\langle U↓n\rangle$ and $\langle V↓n\rangle$
defined by
$$\baselineskip 15pt
\eqalign{U↓0⊗= 0,\cr V↓0⊗=2,\cr}\qquad
\eqalign{U↓1⊗= 1,\cr V↓1⊗=4,\cr}\qquad
\eqalign{U↓{n+1}⊗=4U↓n - U↓{n-1};\cr
V↓{n+1}⊗= 4V↓n - V↓{n-1}.\cr}\eqno(25)$$
The following equations are readily proved
by induction:
$$\baselineskip15pt
\eqalignno{V↓n⊗ = U↓{n+1} - U↓{n-1};⊗(26)\cr
U↓n ⊗=\biglp(2 + \\3\,)↑n - (2 - \\3\,)↑n\bigrp/\\{12};⊗(27)\cr
V↓n ⊗= (2 + \\3\,)↑n + (2 - \\3\,)↑n;⊗(28)\cr
U↓{m+n} ⊗=U↓mU↓{n+1}- U↓{m-1}U↓n.⊗(29)\cr
Let us now prove an auxiliary result, when $p$ is prime and $e ≥ 1:$
$$\hbox{if}\qquad U↓n ≡ 0 \modulo {p↑e}\qquad\hbox{then}\qquad U↓{np}
≡ 0 \modulo {p↑{e+1}}.\eqno (30)$$
This follows from
the more general considerations of exercise 3.2.2--11, but a
simple proof for this case can be given. Assume that $U↓n =
bp↑e$, $U↓{n+1} = a$. By (29) and (25), $U↓{2n} = bp↑e(2a - 4bp↑e)
≡ (2a)U↓n\modulo{p↑{e+1}}$, while $U↓{2n+1} = U↑{2}↓{n+1}
- U↓n↑2 ≡ a↑2$. Similarly, $U↓{3n} = U↓{2n+1}U↓n - U↓{2n}U↓{n-1}≡
(3a↑2)U↓n$ and $U↓{3n+1}=U↓{2n+1}U↓{n+1}-U↓{2n}U↓n≡a↑3$. In general,
$$U↓{kn}≡(ka↑{k-1})U↓n\qquad\hbox{and}\qquad U↓{kn+1}≡a↑k\qquad\modulo{p↑{e+1}},$$
so (30) follows if we take $k = p$.
From formulas (27) and (28) we can obtain
other expressions for $U↓n$ and $V↓n$, expanding $(2 \pm \\3\,)↑n$
by the binomial theorem:
$$\chop to 12pt{U↓n = \sum ↓{k}{n\choose2k + 1}\,2↑{n-2k-1}3↑k,\qquad
V↓n = \sum ↓{k}{n\choose2k}\,2↑{n-2k+1}3↑k.}\eqno (31)$$
Now if we set $n = p$, where $p$ is an odd prime,
and if we use the fact that $p\choose k$ is a multiple of
$p$ except when $k = 0$ or $k = p$, we find that
$$U↓p ≡ 3↑{(p-1)/2},\qquad V↓p ≡ 4\qquad \modulo p.\eqno(32)$$
If $p ≠ 3$, Fermat's theorem tells us that $3↑{p-1}
≡ 1$; hence $(3↑{(p-1)/2}- 1) \cdot (3↑{(p-1)/2}+1)≡0$, and $3↑{(p-1)/2}≡\pm1$.
When $U↓p ≡ -1$,
we have $U↓{p+1} = 4U↓p - U↓{p-1} = 4U↓p + V↓p - U↓{p+1} ≡ -U↓{p+1}$;
hence $U↓{p+1}\mod p = 0$. When $U↓p ≡ +1$, we have $U↓{p-1}
= 4U↓p - U↓{p+1} = 4U↓p - V↓p - U↓{p-1} ≡ -U↓{p-1}$; hence $U↓{p-1}
\mod p = 0$. We have proved that, for all primes $p$,
there is an integer $ε(p)$ such that
$$U↓{p+ε(p)}\mod p=0, \qquad |ε(p)|≤1.\eqno(33)$$
Now if $N$ is any positive integer, and if $m=m(N)$ is the smallest positive
integer such that $U↓{m(N)}\mod N = 0$, we have
$$U↓n \mod N = 0\qquad\hbox{if and only if\qquad$n$ is a multiple
of $m(N)$.}\eqno (34)$$
(This number $m(N)$ is called the ``rank of apparition''
of $N$ in the sequence.) To prove (34), observe that the sequence
$U↓m$, $U↓{m+1}$, $U↓{m+2}$, $\ldots$ is congruent modulo $N$ to
$aU↓0$, $aU↓1$, $aU↓2$, $\ldotss$, where $a = U↓{m+1}\mod N$ is
relatively prime to $N$ because $\gcd(U↓n, U↓{n+1}) = 1$.
With these preliminaries out of the way, we are ready to prove
Theorem L\null. By (24) and induction,
$$L↓n = V↓{2↑n}\mod (2↑q - 1).\eqno (35)$$
Furthermore, it follows from the identity $2U↓{n+1}
= 4U↓n + V↓n$ that $\gcd(U↓n, V↓n) ≤ 2$, since any common factor
of $U↓n$ and $V↓n$ must divide $U↓n$ and $2U↓{n+1}$, while $\gcd(U↓n,
U↓{n+1}) = 1$. So $U↓n$ and $V↓n$ have no odd factor in common, and
if $L↓{q-2} = 0$ we must have
$$\baselineskip15pt
\eqalign{U↓{2↑{q-1}}=U↓{2↑{q-2}}V↓{2↑{q-2}}⊗≡0\;\modulo{2↑q-1},\cr
U↓{2↑{q-2}}⊗\neqv0\modulo{2↑q-1}.\cr}$$
Now if $m=m(2↑q-1)$ is the rank of apparition of $2↑q-1$, it must be a divisor
of $2↑{q-1}$ but not of $2↑{q-2}$; thus $m=2↑{q-1}$. We will prove that $n=2↑q-1$
must therefore be prime: Let the factorization of $n$ be $p↓1↑{e↓1}\ldotss
p↓r↑{e↓r}$. All primes $p↓j$ are greater than 3, since $n$ is odd and congruent
to $(-1)↑q-1=-2\modulo 3$. From (30), (33), and (34) we know that $U↓t≡0\modulo
{2↑q-1}$, where
$$t=\lcm\biglp p↓1↑{e↓1-1}(p↓1+ε↓1),\,\ldotss,\,p↓r↑{e↓r-1}(p↓r+ε↓r)\bigrp,$$
and each $ε↓j$ is $\pm1$. Therefore $t$ is a multiple of $m=2↑q-1$. Let $n↓0=
\prod↓{1≤j≤r}p↓{\!j}↑{e↓j-1}(p↓j+ε↓j)$; we have $n↓0≤\prod↓{1≤j≤r}p↓{\!j}
↑{e↓j-1}(p↓j+
{1\over5}p↓j)=({6\over5})↑rn$. Also, because $p↓j+ε↓j$ is even, $t≤n↓0/2↑{r-1}$,
since a factor of two is lost each time the least common multiple of two even
numbers is taken. Combining these results, we have $m≤t≤2({3\over5})↑rn<4({3\over5}
)↑rm<3m$; hence $r≤2$ and $t=m$ or $t=2m$, a power of 2. Therefore $e↓1=1$,
$e↓r=1$, and if $n$ is not prime we must have $n=2↑q-1=(2↑k+1)(2↑l-1)$ where
$2↑k+1$ and $2↑l-1$ are prime. But the latter is obviously impossible when
$q$ is odd, so $n$ is prime.
%folio 512 galley 13 Total loss. (C) Addison-Wesley 1978 *
\def\bslash{\char'477 } % boldface slash (vol. 2 only)
\def\\#1{\sqrt{\hskip1pt\lower1pt\null#1}}
Conversely, suppose that $n = 2↑q
- 1$ is prime; we must show that $V↓{2↑{q-2}} ≡ 0\modulo n$.
For this purpose it suffices to prove that $V↓{2↑{q-1}} ≡ -2
\modulo n$, since $V↓{2↑{q-1}}=(V↓{2↑{q-2}})↑2 - 2$. Now
$$\eqalign{V↓{2↑{q-1}}⊗= \biglp(\\2 + \\6\,)/2\bigrp↑{n+1}
+ \biglp(\\2 - \\6\,)/2\bigrp↑{n+1}\cr
\noalign{\vskip6pt}
⊗= 2↑{-n} \sum↓{k}{n+1\choose2k}\\2↑{\,n+1-2k}\\6↑{\,2k} =
2↑{(1-n)/2}\sum ↓{k}{n+1\choose2k}\,3↑k.\cr}$$
Since $n$ is prime, the binomial coefficient$${n+1\choose2k}={n\choose2k}+
{n\choose2k-1}$$
is divisible by $n$ except when $k = 0$ or $k
= (n + 1)/2$; hence
$$2↑{(n-1)/2\,}V↓{2↑{q-1}} ≡ 1 + 3↑{(n+1)/2}\modulo n.$$
Here $2 ≡ (2↑{(q+1)/2})↑2$, so $2↑{(n-1)/2} ≡
(2↑{(q+1)/2})↑{(n-1)} ≡ 1$ by Fermat's theorem. Finally, by
a simple case of the law of quadratic reciprocity (exercise
1.2.4--47), $3↑{(n-1)/2}≡-1$, since $n\mod 3 =
1$ and $n\mod 4= 3$. This means $V↓{2↑{q-1}} ≡ -2$, so $V↓{2↑{q-2}}≡0$.\quad
\blackslug
\exbegin{EXERCISES}
\exno 1. [10] If the sequence $d↓0$, $d↓1$, $d↓2$, $\ldots$
of trial divisors in Algorithm A contains a number that
is not prime, why will it never appear in the output?
\exno 2. [15] If it is known that the input $N$ to Algorithm
A is equal to 3 or more, could step A2 be eliminated?
\exno 3. [M20] Show that there is a number $P$ with the following
property: If $1000 ≤ n ≤ 1000000$, then $n$ is prime if and only
if $\gcd(n, P) = 1$.
\exno 4. [M24] (J. M. Pollard.)\xskip In the notation of exercise 3.1--7 and
Section 1.2.11.3, prove that the average value of the least $n$ such that
$X↓n=X↓{\,l(n)-1}$ lies between $1.5\,Q(m)-0.5$ and $1.625\,Q(m)-0.5$.
\exno 5. [21] Use Fermat's method (Algorithm D) to find the factors of 10541 by
hand, when the moduli are 3, 5, 7, and 8.
\exno 6. [M24] If $p$ is an odd prime and if $N$ is not a multiple of $p$,
prove that the number of integers $x$ such that $0≤x<p$ and $x↑2-N≡y↑2\modulo p$ has
a solution $y$ is equal to $(p\pm1)/2$.
\exno 7. [25] Discuss the problems of programming the sieve of Algorithm D on
a binary computer when the table entries for modulus $m↓i$ do not exactly fill
an integral number of memory words.
\trexno 8. [23] ({\sl The ``sieve of Eratosthenes,''} 3rd century {\:mB.C.})\xskip
The following procedure evidently discovers all odd prime numbers less than a given
integer $N↓{\null}$,
since it removes all the nonprime numbers: Start with all the odd
numbers less than $N$; then successively strike out the multiples $p↓k↑2$,
$p↓k(p↓k+2)$, $p↓k(p↓k+4)$, $\ldots$, of the $k$th prime $p↓k$, for $k=2$, 3, 4,
$\ldotss$, until reaching a prime $p↓k$ with $p↓k↑2>N$.
Show how to adapt the
procedure just described into an algorithm that is directly suited to
efficient computer calculation, using no multiplication.
\exno 9. [M25] Let $n$ be an odd number, $n ≥ 3$. Show that if
the number $λ(n)$ of Theorem 3.2.1.2B is a divisor of $n - 1$
but not equal to $n - 1$, then $n$ must have the form $p↓1p↓2
\ldotsm p↓t$ where the $p$'s are distinct primes and $t ≥ 3$.
\trexno 10. [M26] (John Selfridge.)\xskip Prove that if, for each prime
divisor $p$ of $n - 1$, there is a number $x↓p$ such that $x↑{(n-1)/p}↓{p}
\mod n ≠ 1$ but $x↑{n-1}↓{p}\mod n = 1$, then $n$ is prime.
\exno 11. [M20] What outputs does Algorithm E give when $N =
197209$, $k = 5$, $m = 1$?\xskip [{\sl Hint:} $\sqrt{\hskip1pt 5 \cdot 197209} = 992
+ \bslash \overline{1, 495, 2, 495, 1, 1984}\bslash$.]
\trexno 12. [M28] Design an algorithm that uses the outputs of
Algorithm E to find a proper factor of $N↓{\null}$, provided that Algorithm E
has produced enough outputs to deduce a solution of (17).
\exno 13. [M27] Given a prime $p$ and a positive integer $d$,
what is the value of $f(p, d)$, the average number of times
$p$ divides $A↑2 - dB↑2$, when $A$ and $B$ are random integers
that are independent except for the condition $\gcd(A, B) = 1$?
\exno 14. [M20] Prove that the number $T$ in step E3 of Algorithm
E will never be a multiple of an odd prime $p$ for which $(kN)↑{(p-1)/2}
\mod p > 1$.
\trexno 15. [M34] (Lucas and Lehmer.)\xskip Let $P$ and $Q$ be relatively
prime integers, and let $U↓0 = 0$, $U↓1 = 1$, $U↓{n+1} = PU↓n -
QU↓{n-1}$ for $n ≥ 1$. Prove that if $N$ is a positive integer
relatively prime to $2P↑2 - 8Q$, and if $U↓{N+1}\mod N
= 0$, while $U↓{(N+1)/p}\mod N ≠ 0$ for each prime $p$ dividing
$N + 1$, then $N$ is prime.\xskip (This gives a test for primality
when the factors of $N + 1$ are known instead of the factors
of $N - 1$. The value of $U↓m$ can be evaluated in $O(\log
m)$ steps; cf.\ exercise 4.6.3--26.)\xskip [{\sl Hint:} See the
proof of Theorem L.]
\exno 16. [M50] Are there infinitely many Mersenne primes?
\exno 17. [M25] (V. R. Pratt.)\xskip A complete proof of primality by the
converse of Fermat's theorem takes the form of a tree whose nodes have the form
$(q,x)$, where $q$ and $x$ are positive integers satisfying the following
arithmetic conditions:\xskip (i) If $(q↓1,x↓1)$, $\ldotss$, $(q↓t,x↓t)$ are the
sons of $(q,x)$ then $q=q↓1\ldotsm q↓t+1$.\xskip [In particular, if $(q,x)$
has no sons then $q=2$.]\xskip(ii) If $(r,y)$ is a son of $(q,x)$ then
$x↑{(q-1)/r}\mod q≠1$.\xskip(iii) For each node $(q,x)$, we have $x↑{q-1}\mod q=1
$.\xskip From these conditions it
follows that $q$ is prime and $x$ is a primitive root modulo $q$, for all
nodes $(q,x)$.\xskip[For example, the tree
$$\baselineskip 24pt\vbox{\halign{\hbox to size{$\hfill#\hfill$}\cr
(1009,11)\cr
(2,1)\qquad(2,1)\qquad(2,1)\qquad(2,1)\qquad(7,3)\qquad(3,2)\qquad(3,2)\qquad\qquad
\cr
\qquad\qquad\qquad\qquad\qquad\qquad(2,1)\qquad(3,2)\qquad(2,1)\qquad(2,1)\qquad\cr
\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad(2,1)\qquad\qquad\qquad\qquad\cr}}$$
demonstrates that 1009 is prime.]\xskip Prove that such a tree with root $(q,x)$ has
at most $f(q)$ nodes, where $f$ is a rather slowly growing function.
\trexno 18. [HM23] Give a
heuristic proof of (7), analogous to the text's derivation of
(6). What is the approximate probability that $p↓{t-1}
≤ \sqrt{\chop to 0pt{p↓t}}\,$?
\trexno 19. [M25] (J. M. Pollard.)\xskip Show how to compute a number
$M$ that is divisible by all primes $p$ such that $p-1$ is a divisor
of some given number
$D$.\xskip [{\sl Hint:} Consider numbers of the form $a↑n - 1$.]\xskip
Such an $M$ is useful in factorization, for by computing $\gcd(M,N)$ we
may discover a factor of $N↓{\null}$.
Extend this idea to an efficient method that has high probability
of discovering prime factors $p$ of a given large number $N↓{\null}$,
when all prime power factors of $p - 1$ are less than $10↑3$
except for at most one prime factor less than $10↑5$.\xskip [For example,
the second-largest prime dividing (14) would be detected by this method, since
it is $1 + 2↑4 \cdot 5↑2 \cdot 67 \cdot 107 \cdot 199 \cdot 41231$.]
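One way to realize the hint is sketched below in Python (the first stage
only; the bound $B$, the base 3, and the crude inner primality test are
arbitrary choices made for brevity):

    from math import gcd

    def pollard_p_minus_1(N, B=1000, a=3):
        # Computes a^M mod N, where M is the product of all prime
        # powers <= B; then gcd(a^M - 1, N) collects every prime p
        # dividing N for which p - 1 divides M.
        for q in range(2, B + 1):
            if all(q % d != 0 for d in range(2, q)):   # q is prime
                qe = q
                while qe * q <= B:
                    qe *= q                            # largest q^e <= B
                a = pow(a, qe, N)
        return gcd(a - 1, N)       # a factor of N, or 1, or (rarely) N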
\exno 20. [M40] Consider exercise 19 with $p + 1$ replacing
$p - 1$.
\exno 21. [M49] Let $m(p)$ be the number of iterations required by Algorithm B
to cast out the prime factor $p$. Is $m(p)=O(\sqrt{\chop to 0pt p}\,)$ as $p→∞$?
\trexno 22. [M30] (M. O. Rabin.)\xskip Let $p↓n$ be the probability that Algorithm P
guesses wrong, given $n$. Show that $p↓n<{1\over4}$ for all $n$.
\exno 23. [M32] The {\sl Jacobi symbol\/} $({p\over q})$ is defined to be $-1$,
0, or $+1$ for all integers $p≥0$ and all odd integers $q>1$ by the rules
$({p\over q})≡p↑{(q-1)/2}\modulo q$ when $q$ is prime; $({p\over q})=
({p\over q↓1})\ldotsm({p\over q↓t})$ when $q$ is the product $q↓1\ldotsm q↓t$ of
$t$ primes (not necessarily distinct).
\def\\#1{\raise 2pt\hbox{$\scriptstyle#1$}}
\yskip\hang\textindent{a)}Prove that $({p\over q})$ satisfies the following
relationships, hence it can be computed efficiently:\xskip $({\\0\over q})=0$;\xskip
$({\\1\over q})=1$;\xskip $({p\over q})=({p\mod q\over q})$;\xskip
$({2p\over q})=({\\2\over q})({p\over q})$;\xskip $({\\2\over q})=(+1,-1,-1,+1)$
according as $q\mod 8 = (1,3,5,7)$;\xskip $({p\over q})=(-1)↑{(p-1)(q-1)/4}
({q\over p})$ if both $p$ and $q$ are odd.\xskip $\biglp$The latter law, which
is a reciprocity relation reducing the evaluation of $({p\over q})$ to the
evaluation of $({q\over p})$, has been proved in exercise 1.2.4--47(d) when
both $p$ and $q$ are prime, so its validity in that special case may be
assumed here.$\bigrp$
\yskip\hang\textindent{b)}(Solovay and Strassen.)\xskip Prove that if $n$ is odd but
not prime, the number of integers $x$ such that $1≤x<n$ and $0≠({\\x\over n})≡
x↑{(n-1)/2}\modulo n$ is at most ${1\over2}\varphi(n)$.\xskip$\biglp$Thus,
the following testing procedure correctly determines whether or not a given
$n$ is prime, with probability $≥{1\over2}$ for all fixed $n$: ``Generate $x$
at random with $1≤x<n$. If $0≠x↑{(n-1)/2}≡({\\x\over n})\modulo n$, say that
$n$ is probably prime, otherwise say that $n$ is definitely not prime.''$\bigrp$
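The rules of part (a) translate directly into a short program; for example,
in Python (a sketch, for integers $p≥0$ and odd $q>1$):

    def jacobi(p, q):
        j = 1
        while True:
            p %= q                    # (p/q) = ((p mod q)/q)
            if p == 0:
                return 0              # q > 1 divides p
            while p % 2 == 0:
                p //= 2               # use (2p/q) = (2/q)(p/q)
                if q % 8 in (3, 5):
                    j = -j            # (2/q) = -1 when q mod 8 = 3 or 5
            if p == 1:
                return j
            if p % 4 == 3 and q % 4 == 3:
                j = -j                # the reciprocity sign, p and q odd
            p, q = q, p

Together with one modular exponentiation, this yields the testing procedure
of part (b).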
\trexno 24. [M25] (L. Adleman.)\xskip When $n>1$ and $x>1$ are integers, $n$ odd,
let us say that $n$ ``passes the $x$
test of Algorithm P'' if either $x=n$ or if steps
P2--P5 lead to the conclusion that $n$ is probably prime.
Prove that, for any $N$, there exists a set of
positive integers
$x↓1$, $\ldotss$, $x↓m≤N$ with $m≤\lfloor\lg N\rfloor$ such that a positive odd
integer in the range
$1<n≤N$ is prime if and only if it passes the $x$ test of Algorithm P for
$x=x↓1\mod n$, $\ldotss$, $x=x↓m\mod n$. Thus, the probabilistic test for
primality can in principle be converted into an efficient test that leaves nothing
to chance.\xskip(You need not show how to compute the $x↓j$ efficiently; just
prove that they exist.)
\trexno 25. [M35] (A. Shamir.)\xskip Consider an abstract computer that can perform
the operations $x+y$, $x-y$, $x\cdot y$, and $\lfloor x/y\rfloor$ on integers
$x$ and $y$ of arbitrary length, in just one unit of time, no matter how large
those integers are. The machine stores integers in a random-access memory and it
can select different program steps depending on whether or not $x=y$, given
$x$ and $y$. The purpose of this exercise is to demonstrate that there is
an amazingly fast way to factorize numbers on such a computer.\xskip
$\biglp$Therefore it will
probably be quite difficult to show that factorization is inherently difficult
on {\sl real\/} machines, although we suspect that it is.$\bigrp$
\yskip\hang\textindent{a)}Find a way to compute $n!$ in $O(\log n)$ steps on
such a computer, given an integer value $n≥2$.\xskip[{\sl Hint:} If $
A$ is a sufficiently large integer, the binomial coefficients ${m\choose k}=
m!/(m-k)!\,k!$ can be computed readily from the value of $(A+1)↑m$.]
\yskip\hang\textindent{b)}Show how to compute a number $f(n)$ in $O(\log n)$ steps
on such a computer, given an integer value $n≥2$, having the following
properties:\xskip$f(n)=n$ if $n$ is
prime, otherwise $f(n)$ is a proper (but not necessarily prime) divisor of
$n$.\xskip[{\sl Hint:} If $n≠4$, one such function $f(n)$ is $\gcd\biglp m(n),
n\bigrp$, where $m(n)=\min\leftset m\relv m!\mod n=0\rightset$.]
\yskip\noindent$\biglp$As a consequence of (b), we can completely factor a given
number $n$ by doing only $O(\log n)↑2$ arithmetic operations on arbitrarily
large integers: Given a partial factorization $n=n↓1\ldotsm n↓r$, each
nonprime $n↓i$ can be replaced by $f(n↓i)\cdot\biglp n↓i/f(n↓i)\bigrp$ in
$\sum O(\log n↓i)=O(\log n)$ steps, and this refinement operation can be
repeated until all $n↓i$ are prime.$\bigrp$
\vfill\eject